decisions affecting major programs, (b) address issues of significant concern to the
Executive Branch, the Congress and/or the public, or (c) address issues that have
significant economic implications. IDA Reports are reviewed by outside panels of experts
to ensure their high quality and relevance to the problems studied, and they are released
by the President of IDA.

that they meet the high standards expected of refereed papers in professional journals or

Documents

substantive work done in quick reaction studies, (b) to record the proceedings of
conferences and meetings, (c) to make available preliminary and tentative results of
analyses, (d) to record data developed in the course of an investigation, or (e) to forward
information that is essentially unanalyzed and unevaluated. The review of IDA Documents
is suited to their content and intended use.

Preface


This document was prepared by the Institute for Defense Analyses (IDA) under the
Task Order, Ballistic Missile Defense Office Software Technology Plan, and fulfills an
objective of the task, to deliver the proceedings of the workshop on Large, Distributed,
Parallel Architecture, Real-Time Systems. The workshop was sponsored by the Ballistic
Missile Defense Office and NASA Ames Research Center.

The Document was assembled and the introductory remarks were written by Dr.
Dennis Fife, Dr. Norman Howes, Mr. David Wheeler, and Mr. Jonathan Wood. The
contents of the document were furnished by the participants of the workshop, to whom we
express our sincere appreciation.

Foreword


The Institute for Defense Analyses hosted a workshop on the software engineering
issues associated with large, distributed, parallel architecture, real-time systems from
March 15th through March 19th, 1993. It was jointly sponsored by the Strategic Defense
Initiative (now the Ballistic Missile Defense Organization) and NASA Ames Research
Center. The workshop’s purpose was to bring together researchers from industry and
academia who are currently designing systems of this type or doing research that bears upon
real-time issues of interest to the sponsors. Participation at the workshop was by invitation.
Twenty-three people were invited and seventeen accepted. They
contributed a total of fifteen papers. Also participating were representatives from the
sponsoring agencies.

The participants were asked to submit a position paper prior to the end of 1992 on
the following four issues: (1) What is the best design methodology for this class of systems?
(2) What is the proper relationship between design theory and scheduling theory? (3) What
is the best method for validating this class of systems? and (4) What are the most promising
areas where resources might be applied for near-term benefits? These position papers were
distributed to all the participants on February 1, 1993 so that they could review
the positions being taken by their fellow participants. At the workshop, each participant
was given the opportunity to formally present his or her position. However, much of the
workshop time was devoted to informal discussions of the basic issues.

The proceedings that follow include several items. First, there is a copy of the
workshop agenda (which was not followed in exact order due to several on-line reschedulings
of the formal talks). Second, there is a summary of the discussions of the four major issues
of the workshop. This summary was prepared by IDA workshop participants and thus may
not reflect the opinions of all the participants. Third, there is a copy of the fifteen position
papers presented at the workshop. Fourth, there are copies of the transparencies for most of
the formal presentations. Copies of presentations by the sponsors were not included, at their
request.

The workshop was very helpful in summarizing and clarifying the key issues
related to the class of systems discussed at the workshop. It was also helpful in quantifying the
magnitude of the problems associated with the issues. On behalf of the sponsors and IDA,
we would like to express our thanks to all participants for sharing with us their experience
and insights.


Dennis Fife
Norman Howes
David Wheeler
Jonathan Wood


Summary of the Workshop


The workshop revealed sharp differences of opinion with regard to the first two
workshop issues, namely, what is the best method to design large, distributed real-time
systems, and what should be the relationship between design theory and scheduling theory?
The differences confirmed the often mentioned “disconnect” between design theory and
scheduling theory. Several real-time designers at the workshop indicated that they
concentrate on designing practical systems that have acceptable real-time behavior, relying on
design guidelines they have developed from experience that are meaningful to them. These
guidelines often have provisions for designing qualities into the system other than
timing behavior, e.g., maintainability, conceptual clarity, testability, etc. They then use
measurement and tuning during the software testing phase to correct or improve their design’s
real-time behavior.

Scheduling theorists at the workshop thought that designers should concentrate on
deriving abstract representations of processing in order to apply an established scheduling
model, e.g., rate monotonic scheduling, that can ensure acceptable real-time behavior (if
the abstract representation faithfully models the real system). If the real system fails to
satisfy the necessary assumptions and approximations of the model, then the scheduling
theorists advised redesign of the application system to better fit the abstract model. This
disconnect needs to be overcome through better design methods and tools that improve
understanding and control of real-time behavior much earlier than software testing, and
through scheduling concepts that support more general and flexible decision-making. The
workshop discussion tried to identify specific strategies to “work around” the disconnect
and thereby make best use of the guidance available from both schools of thought.
Basically, these work-around strategies were iterative techniques to do either a schedulability
analysis of a design or a designability analysis of an abstract scheduling representation
(depending on which point of view one adopts) early in the design phase, and to continue
refining the design with schedulability considerations as the design progresses.

On the other hand, the workshop revealed consensus on how real-time systems
should be validated. This consensus called for designing code instrumentation into systems
for validation purposes and leaving it in when a system is fielded. This ensures that the
system timing is not altered between validation and fielding, and it provides a “window” into
the operational system to allow reproduction of event sequences leading up to any system
failures. This is important since real-time system failures are usually difficult to reproduce
due to concurrent activities within the system.

The workshop provided some ideas for investigations in the near term that address
issues arising with the new generation of distributed (and often parallel) real-time systems.
Specifically, it was recommended that the proper application of object-oriented technology
be determined for real-time systems and that a standardized real-time Inter-Process
Communication (IPC) mechanism be developed that is independent of system architecture.
Further, it was recommended that there should be more support (funding) for experimentation
with (as opposed to creating more) real-time design methodologies, in order to quantify
their value by competitive design “shoot outs” with some objective party refereeing the
competition.


Overview  of  Workshop  Discussions 


On the issue of what is the best design method for the class of systems under
discussion, there was virtually no agreement. Those who discussed design methods all
discussed different methods. Furthermore, some of those whose specialty is scheduling theory
questioned whether the design methods presented were real-time design methods at all.
Their objection is that time is not taken into consideration in the design methods in such a
way that “schedulability analysis” can be conducted from the very beginning of the design
process. Schedulability analysis provides the ability to determine if the individual tasks
comprising a system will meet their deadlines.
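
As a hedged illustration of what such an analysis can look like in its simplest form, the sketch below applies the classical Liu and Layland utilization bound for rate monotonic scheduling. It assumes independent periodic tasks on a single processor with deadlines equal to periods. The function names and the task parameters are hypothetical and are not taken from the workshop papers. The test is sufficient but not necessary, so a task set that fails it may still be schedulable.

```python
def rm_utilization_bound(n):
    """Liu-Layland bound n * (2**(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def passes_rm_bound(tasks):
    """tasks: list of (worst_case_execution_time, period) pairs.

    Returns True when total utilization is within the bound, which
    guarantees that every task meets its deadline (deadline = period)
    under rate monotonic priorities. A False result is inconclusive.
    """
    if not tasks:
        return True  # nothing to schedule
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


# Hypothetical task set: (execution time, period) in milliseconds.
example_tasks = [(1, 4), (1, 5), (2, 10)]
print(passes_rm_bound(example_tasks))
```

The example set has a utilization of 0.65, which is below the three-task bound of roughly 0.78, so the test passes. Note that the inputs are per-task execution times and periods, which is exactly the information the designers at the workshop said is rarely available at specification time.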

On  the  other  hand,  several  who  have  been  engaged  in  real-time  design  projects  were 
not  concerned  whether  schedulability  analysis  could  be  performed  from  the  beginning  of 
the design process. Everyone thought that some form of schedulability analysis was
eventually necessary (the extreme being to validate system timing during system testing). It was
agreed  during  the  discussions  that  most  real  system  timing  requirements  are  stated  in  terms 
of  end-to-end,  top  level  requirements  rather  than  in  terms  of  deadlines  for  individual  tasks, 
which  are  hardly  ever  known  at  the  time  of  system  specification.  Some  designers  argued 
that  the  chief  concern  during  the  initial  phase  of  design  was  the  structuring  of  the  system  to 
meet  these  top  level  requirements  rather  than  worrying  about  individual  task  deadlines, 
since  the  individual  tasks  might  not  even  be  known  so  early. 

But almost without exception, the scheduling theorists seemed to be of the opinion
that the first thing that needed to be done was to decompose the systems into tasks in order
that schedulability analysis could be performed, arguing that at all times there needs to be
some level of assurance (less certain at first, more certain as the design progresses) that the
set of tasks into which the system is currently decomposed can meet their
deadlines. All the real-time design methods discussed at the workshop involved some method
of using system requirements to structure the design, often taking into consideration
features other than simply the timing requirements. This leads to system decompositions (into
tasks) that are often hard to analyze using current scheduling theory tools. Essentially, the
designers want to be able to make design decisions not necessarily related to timing
requirements and then have scheduling tools that can accommodate these designs, whereas
the scheduling theorists want designers to work within a framework that makes the
scheduling problem more tractable.

The end result is the previously mentioned “disconnect” between design theory and
scheduling theory. The discussions revealed that this disconnect is very real and needs to
be given careful consideration during the design of large, complex, real-time systems. For
small single processor, real-time systems, e.g., intelligent controllers or low level sample
data systems, a designer often can work with current scheduling theory. But it must be
recognized that the assumptions of some of the theories, e.g., rate-monotonic scheduling
theory, impose considerable restrictions on the design process (for instance, the assumptions
of a single processor, non-distributed architecture; of periodic or possibly sporadic tasks;
of static periodicities or perhaps a small finite collection of modes; of known task execution
times, etc.), to the point of being unrealistic to many designers. At present, the prospect for
developing scheduling theories that support designers, in the way they would like to design,
is limited. There are some results in this area (see for instance the paper by Le Lann);
however, generalized results (ones not predicated on highly restrictive assumptions) appear to
be difficult to realize.

During the presentations and discussions of the second day, it was generally agreed
that real-time design theory and real-time scheduling theory are related, but there was no
agreement as to what the relationship should be. Designers usually view schedulability
analysis as a tool (one of many) to be used in the design process. Even when some
scheduling theory can be applied to the tasks comprising their design, and even when the
theory predicts that the tasks are schedulable, they are not necessarily convinced that the real
system will behave this way when it is coded and tested. This is because of the many
factors that affect system timing that are difficult to take into consideration when applying
the theory, especially when distributed or parallel processing elements are present in the
design. What designers want is a scheduling theory that takes into consideration the
realities of the problem space they are designing in. They do not consider analyses based on
restrictive assumptions particularly relevant to the design process. In fact, most of them are
not sure what weight should be attached to such analyses.

On the other hand, scheduling theorists usually believe that proper timing
behavior is so critical in real-time systems (it is this factor that distinguishes them from
other types of systems) that the whole design process needs to be structured around
schedulability analysis. Otherwise, one is likely to produce a system that is impossible to make
meet its timing requirements. This marked difference in viewpoint will likely continue to
prevent the emergence of complementary design and scheduling theories in the near future.

During the third day, surprising consensus was achieved regarding the issue of
validation of real-time systems. Essentially everyone was in agreement that in order to validate
that real-time systems meet their timing requirements, and continue to meet their timing
requirements when fielded, it is necessary to instrument the code with some type of event
recording package, and that the instrumentation code remain a part of the fielded system. It
was agreed that most code debuggers with which anyone had experience alter the timing
of the system under test and are therefore of little value in validation. Further, it was agreed
that the few general code instrumentation packages that exist have too much overhead to
be permanently left in the code. The current state of the practice is for developers to build
in their own test instrumentation, using specific knowledge of the application to achieve
efficiency.

This built-in test instrumentation code also provides a “window” into the system
during operation. It is constantly running, dumping event occurrence information into some
area of memory which is overwritten when full. In the event of a system failure, the events
that occurred for some time period in the past (depending on memory size) are preserved.
It is only in this way that we can hope to reconstruct what happened. Without this
technique, it is generally not known how to analyze timing failures in real-time systems since
they may not be readily reproducible.
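
The overwrite-when-full event area described above is essentially a ring buffer. The following is a minimal sketch of the idea only, not a description of any participant's instrumentation. The class name `EventRecorder`, its methods, and the event names are hypothetical, and a fielded system would more likely use a preallocated buffer in a lower-level language to keep overhead small and predictable.

```python
import time
from collections import deque


class EventRecorder:
    """Fixed-capacity event log; the oldest entries are overwritten when full."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently discards the oldest item on overflow.
        self._events = deque(maxlen=capacity)

    def record(self, event_id, timestamp=None):
        # Default to a monotonic clock so wall-clock adjustments
        # cannot reorder the recorded events.
        if timestamp is None:
            timestamp = time.monotonic()
        self._events.append((timestamp, event_id))

    def snapshot(self):
        """Return the preserved events, oldest first, e.g. after a failure."""
        return list(self._events)


# Hypothetical usage: four events recorded into a three-slot buffer.
recorder = EventRecorder(capacity=3)
for tick, name in enumerate(["sensor_read", "filter", "actuate", "sensor_read"]):
    recorder.record(name, timestamp=tick)
print(recorder.snapshot())
```

After the fourth event the first one has been overwritten, so the snapshot holds only the three most recent events. The buffer capacity plays the role of the "memory size" that determines how far back the event history reaches.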

During the last day of the workshop, participants discussed the issue of where
resources should be placed to try to solve the design methodology and the design vs.
scheduling disconnect problems (since there was total agreement on the best way to validate
real-time systems, it was no longer treated as an issue). Two suggestions that received a good
deal of support involved narrower aims; namely, determining the role of object-oriented
technology in the real-time problem domain, and standardizing real-time Inter-Process
Communication (IPC). Both of these suggestions are directed at pieces of the overall
design process that are becoming critical in current designs. Beyond this, it was
recommended that there should be more funding support for experimentation with (as opposed to
creating more) real-time design methodologies, and that this experimentation should seek
to compare and quantify the value of these methodologies in various situations. Finally,
there was the suggestion, which received almost total support, that there be support (funding)
for some software design “shoot outs” on one or more representative real-time systems, and
that some objective party referee the competition.

Real-time control systems have been around for a long time in the form of controllers for
closed, limited functionality systems such as those needed for the control of an appliance. Such
controllers were implemented using analog devices and converted to digital implementations
through a complete and detailed analysis of the closed, limited functionality environment
supported by such a system. But in the real world there are many examples of systems which have
significant real-time requirements, which must support an open environment where all applications
and their execution sequences are not known, or are so large that they cannot be taken into account
in the design, and which have significant reliability and fault tolerance requirements.

Sometimes such systems have been referred to as large, mission-critical systems (LMCS). A
typical example is the air traffic control system. Design, implementation and life-time support of
such systems pose many new challenges to the designers and researchers in the field. These
challenges are not only for hardware technology or scheduling. This problem requires an
integrated approach of hardware, scheduling, fault tolerance, and distributed operation in the form of
system technology that can support the systems of tomorrow effectively.

A typical general purpose computing application is characterized by the lack of any timing
constraints within which it must execute. On the other hand, real-time applications must execute
within timing constraints which may be soft or hard. Hard constraints must not be violated while
soft constraints may be violated with defined penalties. An LMCS usually has to support all three
types of applications and must, therefore, provide means for the timely execution of hard, soft and
non real-time applications while permitting controlled interactions among such applications.

Reliability and fault tolerant operation are very critical requirements for LMCS. Such systems
must continue to meet the timely execution requirements even when some faults occur. Many of
the techniques developed for handling faults do not support the timing requirements of a real-time
system. The design of LMCS has to address the conflicting goals of fault handling and real-time
operations such that both goals can be met. Further, the faults may occur not only because of the
failure of a hardware or software component; they may be caused by other conditions, such as
overload conditions resulting from inputs or signals received beyond the designed capacity of the
system. It is essential that the LMCS continue to function under such conditions, supporting at
least the critical applications, and have degraded modes of operations.


Due  to  the  conflicting  requirements  of  real-time  operation  and  fault-tolerant  operation  it  is 
essential  that  the  fault  tolerant  aspects  of  the  system  be  an  integral  part  of  the  system  design  rather 
than  a  separate  add-on  capability. 

In  order  to  support  the  required  functionality  of  operation  from  multiple  locations  and  fault 
tolerance  the  LMCS  has  to  be  implemented  as  a  distributed  system.  A  distributed  system  which 
supports  real-time  as  well  as  fault  tolerance  requirements  poses  many  new  challenges.  Temporal 
coordination  as  well  as  the  synchronization  of  the  time  at  different  machines  has  to  be  supported. 
The communication among machines has to be carried out within the defined time constraints and
the resources at various sites have to be managed in a unified manner.

The  LMCS  require  comprehensive  solutions  which  can  be  implemented  and  supported  during 
the life time of the system in an integrated manner. We believe that straightforward adaptation of
the techniques developed to address the system design problems in isolation is not likely to yield
such  comprehensive  solutions.  The  life  cycle  support  requires  that  not  only  the  typical  CASE  tools 
be  available  but  many  additional  tools  be  available  which  permit  the  analysis  of  the  resource 
management,  fault  handling  and  timing  requirements  for  the  integrated  system. 

The development of large systems requires that the techniques used be scalable. Many
techniques which work well for small problems either do not work for large problems or cause such
overheads that they cannot be used for real-time operation.

One  example  of  a  project  aimed  at  addressing  the  issues  outlined  above  is  Project  Maruti  at  the 
University  of  Maryland.  The  goal  of  this  effort  is  to  develop  a  machine  independent  system  which 
supports LMCS. The current prototype design implements the Maruti kernel on Mach with minimum
changes to the Mach kernel. It supports the multiple time domains of applications and fault
tolerance requirements from the lowest level to the application overload handling. Scheduling and
resource management techniques developed in this effort support the distributed, fault tolerant
operation  of  the  system  and  are  scalable.  Tools  are  being  developed  to  support  all  the  phases  of 
the  life  cycle  of  an  application. 

Abstract

The interrelated subjects of the design and the validation of large scale,
distributed realtime systems are critically important to the high volume,
global, commercial process control industry. A key requirement for these
mission critical control systems is scaleability with respect to such
attributes as functionality, predictable performance, degree of distribution,
fault tolerance, and serviceability. The thesis of this paper is that scaleable
commercial-grade realtime systems do not presently exist, even though the
global process control market demands such systems. The requisite base
technologies exist in various forms, some mature, some embryonic, but
their synthesis into stable, reusable, platform products has not received the
attention of commercial suppliers. This has retarded the acquisition of a
body of experience and the development of associated tools to support the
needed system design theories, specification and development
environments, and verification and validation methodologies.

The invitation to the "Workshop on Large, Distributed, Parallel Architecture RealTime
Systems" (WorLDPARTS?) defines four major issues to which are offered the following
positions. The context of this response is defined by the requirements of large scale
"mission critical" plant and process control applications that are routinely found in the
global  continuous  process  manufacturing  industries.  These  industries  include  chemical, 
petrochemical,  mining  and  minerals,  steel,  pharmaceutical,  electric  utility,  food  and 
beverage,  waste  water  treatment,  and  pulp  and  paper  segments. 

Plant and process control systems for these segments involve thousands of I/O points
(Level 0 transducers), hundreds of regulatory control loops (Level 1 cell controls), tens of
supervisory controls (Level 2 inter-cell, or area controls), one or more sets of plant level
controls (Level 3 inter-area, or plantwide controls), and one or more sets of inter-plant
controls (Level 4 enterprise controls). Mission critical control applications comprise
those hardware/software systems that have primary responsibility for plant and process
production-, safety-, quality-, and regulatory agency reporting-related automation.

Within this application domain, the notion of "large, distributed, parallel architecture,
realtime systems" has a very specific meaning. “Large” implies wide physical
distribution (e.g., a campus setting), multiple nodes or network "end-systems" (e.g., 20-100
computational elements), and/or logically or physically complex node configurations.
“Distributed” connotes both that applications (i.e., execution threads) span nodes, and that
nodes are physically isolated, often geographically close to the manufacturing processes
they are responsible for controlling. “Parallel architecture” includes computing elements
that are either uni-processors that operate in n-for-one redundant sets, or multi-processors
that operate in loosely- or tightly-coupled arrays with local and/or shared global
memories (e.g., symmetric SISD or SIMD configurations). And finally “realtime”
implies that command and control application tasks must operate within specific “hard”
[real]time constraints that must be guaranteed as part of the “correct” operation of the
system.

Although not mentioned in the invitation, we believe that a critically important attribute
of such systems is “scaleability”. Ideally, we would be able to allow functionality,
performance, timeliness, predictability, decentralization, and fault tolerance to be
scaleable attributes of distributed realtime systems. System elements should be able to
scale upwards from small sets of command and control services to large sets with known
cost and complexity measures. Scaleable services, if properly implemented, would
provide a tangible basis for building “correct” systems from the ground up, realizing a
commercially important ability to incrementally expand the system while preserving its
basic correctness.

The commercial “distributed control system” (DCS) business celebrates its 15th birthday
this year, and is now in its second or third generation of technology, beginning with
centralized minicomputer-based data acquisition and control (SCADA) systems and
culminating in networked, microprocessor-based distributed computing systems. The
essential macro elements of today’s systems include I/O subsystems, fault-tolerant
process controllers and data servers, and high performance, graphics-oriented human
interfaces providing command and control console functions. These elements typically
operate over tens of kilometer distances at 1-10 Mbps on redundant fiber, twinax, or coax
backbone networks governed largely by proprietary protocols. The base technologies
used within these contemporary macro products are changing rapidly, and over the short
term will look conservatively something like the following.

Circuit densities are increasing at about 25% per year, doubling every three years.
Device speeds are increasing at a similar rate. This is equivalent to realizing the same
device functionality in half the space at twice the speed every three years. As a related
development, the cost per processor instruction cycle is declining at 25% per year. This
yields 100% additional processing capacity (operating at twice the speed in half the
space) for the same cost every three years. The basis today is 25 MHz machines. By the
mid-life of a new system we will be able to use 200 MHz processors in the same physical
space and at the same prices as today’s machines.
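
The compounding behind these figures can be checked with a few lines of arithmetic. The sketch below is illustrative only. The helper name `growth_factor` is hypothetical, and the nine-year horizon is derived here from the stated doubling rate rather than quoted from the paper.

```python
def growth_factor(annual_rate, years):
    """Compound growth factor after the given number of years."""
    return (1 + annual_rate) ** years


# 25% per year compounds to roughly a doubling over three years.
three_year_factor = growth_factor(0.25, 3)

# Going from 25 MHz to 200 MHz is three doublings (25 -> 50 -> 100 -> 200).
doublings_needed = 3
final_clock_mhz = 25 * 2 ** doublings_needed
years_required = doublings_needed * 3  # one doubling per three years

print(round(three_year_factor, 2), final_clock_mhz, years_required)
```

Under the stated rate, the three-year factor is about 1.95, and reaching 200 MHz from a 25 MHz basis takes three doublings, or about nine years.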

The cost of memory is declining at 15% per year, dropping by a factor of two every five
years. DRAM densities are increasing at about 60% per year, quadrupling every three
years. Therefore, in the span of just 10 years we should see twelve times the memory
density at one quarter the cost. At the same time, application address space is being
consumed at one additional address bit per year, on average, suggesting we need an
additional 10 bits of address over the design half-life of a new machine. In today’s
control systems we use about 17 bits of address space per Level 0 device, 21 bits per
Level 1 device, 23 bits per Level 2 device, and 24 bits per Level 3 device. By the year
2005 we estimate that Level 0 devices will utilize 26 address bits, with 32 bits at Level 1,
34 bits at Level 2, and 36 bits at Level 3. Clearly, 64-bit processors are required to
implement the upper domains of the next generation of machines.

Disk density is increasing about 25% per year, doubling in three years. This keeps pace
with the consumption of DRAM, and suggests that over the life of the system secondary
storage demands will increase for two principal reasons. First, backing storage is
required to contain (at least part of) the static images of the Level 1 through Level 4
machinery. Second, significant archival storage is required to log the operating history of
the plant. For example, a plant with 1,000 field measurements sampled at 1 Hz would
produce a raw Level 0 data rate of 64 Kbps, assuming 64 bits per point (data, plus status,
plus time stamp). That represents a potential uncompressed Level 0 storage requirement
of over 2 Terabits per year, or 250 Mbytes per point per year. Assuming an average
compression factor of 0.6, we can estimate an appetite of 150 Mbytes per point per year of
required archival storage capacity.
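The archival storage estimate can be reproduced directly. The following is a minimal sketch, assuming decimal units (1 Mbyte = 10^6 bytes), a 365-day year, and that the compression factor is the fraction of data remaining after compression; the function and key names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years


def level0_storage(points=1000, sample_hz=1.0, bits_per_point=64, compression=0.6):
    """Return the raw data rate and yearly archival storage for Level 0 measurements."""
    rate_bps = points * sample_hz * bits_per_point
    bits_per_year = rate_bps * SECONDS_PER_YEAR
    mbytes_per_point_year = bits_per_year / points / 8 / 1e6
    return {
        "rate_kbps": rate_bps / 1000,
        "terabits_per_year": bits_per_year / 1e12,
        "mbytes_per_point_year": mbytes_per_point_year,
        # compression is taken as the fraction of the raw volume that remains
        "compressed_mbytes_per_point_year": mbytes_per_point_year * compression,
    }


if __name__ == "__main__":
    for key, value in level0_storage().items():
        print(f"{key}: {value:.1f}")
```

With the defaults this gives 64 Kbps, about 2.02 Terabits per year, roughly 252 Mbytes per point per year uncompressed and roughly 151 Mbytes compressed, in line with the rounded figures in the text.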


Available communications bandwidth is increasing by a factor of 10 every five years.
Its basis today is 10 Mbps, yielding 100 Mbps by 1995, and 10 Gbps by 2005. This
bandwidth is expected to be absorbed for a number of reasons, primarily at automation
Levels 2 and 3, including: i) the routine use of multimedia man-machine interfaces that
support integrated voice and full-frame video display systems; and ii) the increasing
utilization of optical sensors. These sensors have application in many control domains,
but when used for high speed flat sheet production (such as steel, film and paper making)
can produce enormous volumes of data in very short periods.


This  brief  summary  suggests  that  by  the  end  of  the  design  half-life  of  the  next  generation 
of  plant  control  systems  (circa  2005)  the  computational  elements  will  routinely  operate  at 
200-400  MHz,  support  address  spaces  of  30-40  bits,  intercommunicate  at  1-10  Gigabits 
per  second  over  optical  paths,  collectively  track  and  control  an  evolving  plant  state 
comprising  over  10^  objects,  and  utilize  Terabyte  backing  storage  subsystems.  This 
expectation  points  to  the  real  system  design  problem  —  software  —  its  creation, 
configuration,  deployment  and  maintenance. 


With  these  few  remarks  as  background  the  Issues,  as  defined  in  the  IDA  invitation,  are 
addressed  in  corresponding  Position  statements.  These  Positions  are  admittedly  strategic 
and  terse  in  nature.  Technical  details  would  get  messy.  It  is  anticipated  that  the 
following comments will provide sufficient material to begin more in-depth discussions.


Issue  1  “What  is  the  best  method  or  methodology  for  designing  large,  distributed 
realtime  systems  where  processing  elements  may  have  a  parallel 
architecture?" 

Position 1 First, distributed, realtime control systems that utilize computational
elements based on parallel architectures also generally support non-parallel
architecture elements within the same design. The realtime nature
of such systems demands that the distributed threads of control carry with
them scheduling semantics which guarantee predictable (i.e., timely)
performance across both classes of elements. Therefore, both the parallel
and non-parallel architecture elements of the distributed system must
support the same virtual-machine (i.e., VM or kernel) services that provide
for the realtime “transnode” threads. This requires an underlying realtime
IPC mechanism that supports both the parallel-architecture (i.e., tightly
coupled, shared memory environment) elements and the non-parallel
architecture (i.e., loosely coupled, message-based) elements of the
distributed system.

Second, to support the development of scaleable end-use applications that
implement the mission critical control policies of the system, the
applications must be modular, and implemented in such a manner that they
may be bound (compiled, linked, and/or interpreted) onto one or more
processing elements of the distributed system. This suggests that a set of
formal methods is required that is inherently “object-oriented” in
nature, at least in specification and design, if not in implementation.

These and other cogent reasons suggest two important rules for designing
large, distributed realtime systems: i) separate policies required for
system coordination and management from mechanisms used to
implement them within and across various system elements, and ii) clearly
separate the mission critical applications from the underlying virtual
machine by implementing formal application program interfaces (API’s)
which enforce the system design rules.

The separation of system coordination and management policies and
mechanisms allows for the “objectification” of the underlying system
elements while allowing applications to implement for themselves selected
policies. This partitioning supports the (potentially dynamic) modification
of policies as the system runs, providing for adaptation and
reconfiguration of control regimes as external conditions and/or internal
system faults warrant. It also allows for different mechanisms to be used
by different system elements to enforce the same system policies.

The API’s provide for stable interprocess (inter-object) semantics from
which verification and validation can proceed. The interface(s) may
promote the memory or processor models of underlying computing
elements (i.e., shared global address space vs. local memories holding
message-based agents), depending on the needs of the computations. It is
my contention that with the speed of processing elements, the size of
available memory subsystems, and the speed of interconnect networks, the
justification for shared memory is arguable. A fundamental design issue
for scaleable, distributed, realtime systems is location transparency. The
means to achieve it are policy and mechanism issues.

The parallel-architecture elements of the distributed system may well
require a shared address space to carry out their local computations. With
the availability of 64-bit microprocessors (e.g., DEC’s Alpha AXP) and
high speed mesh interconnect structures (e.g., CHPC’s GalacticaNet) there
are a number of proposed parallel architectures that utilize a flat, global
directly addressed memory. While of some academic interest, this design
does not meet the needs of scaleable, mission critical distributed control
systems.


Issue  2  “What  should  be  the  relationship  between  realtime  design  theory  and 
realtime  scheduling  theory  in  a  design  methodology  for  this  class  of 
systems?” 

Position 2 There are a number of concepts associated with this issue that are system
platform (i.e., virtual machine) related, and a number which properly
belong to the control application problem domain. The first point I would
make here is that the two domains are different and must be clearly
understood. Too often the issues of scheduling are considered to be within
the domain of the host operating system when they more properly belong
to the set of application domain objects responsible for some mission
critical function. Again, the design issue here is the clear separation of
mechanisms provided by the underlying VM in support of application-level
policies required to complete some task or transaction in a timely
manner.

The implication of this point of view is to make scheduling parameters
part of the application object’s capability list. The activation (i.e.,
invocation) of an object will then dynamically associate its scheduling
parameters with the underlying VM services responsible for the thread(s)
executing within the object’s address space. A dual of this viewpoint is to
associate scheduling parameters with a computation’s (i.e., process or
task) execution thread at the point of its creation. These parameters then
“follow” the thread as it meanders through the distributed system in
pursuit of named objects (e.g., agents or actors) whose services are
required of the computation. In both cases the scheduling of the object (or
the scheduling of the task thread within the object) is governed by the
semantics of the application as opposed to the operating system of the
particular element executing the object’s methods.

Here the realtime design issue is split into two parts: i) the scheduling
policies required by the application to meet its time requirements (e.g.,
deadline, highest priority, or best effort), and ii) the mechanisms to be
provided by the underlying VM for enforcing a formal and predictable
allocation of system resources for realizing the policies within a
population of (potentially) competing threads.
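The split between application-level policy and VM-level mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's design: the class names (SchedParams, TransnodeThread, Node) and the ordering rule (earliest deadline first, then priority, then arrival order) are hypothetical choices made for the example.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedParams:
    """Policy chosen by the application; it travels with the thread."""
    policy: str                      # "deadline", "priority" or "best_effort" (informational)
    deadline: float = float("inf")   # seconds; best-effort work has no deadline
    priority: int = 0


@dataclass
class TransnodeThread:
    name: str
    params: SchedParams


class Node:
    """Mechanism supplied by every node's VM: a ready queue ordered by the
    parameters each thread carries, not by node-local defaults."""

    def __init__(self, name):
        self.name = name
        self._ready = []
        self._seq = 0  # tie-breaker preserving arrival order

    def admit(self, thread):
        key = (thread.params.deadline, -thread.params.priority, self._seq)
        heapq.heappush(self._ready, (key, thread))
        self._seq += 1

    def dispatch(self):
        return heapq.heappop(self._ready)[1]


def dispatch_order(node, arrivals):
    """Admit threads in arrival order and return the order in which they run."""
    for thread in arrivals:
        node.admit(thread)
    return [node.dispatch().name for _ in arrivals]


if __name__ == "__main__":
    alarm = TransnodeThread("alarm", SchedParams("deadline", deadline=0.010))
    valve = TransnodeThread("valve", SchedParams("deadline", deadline=0.050))
    logger = TransnodeThread("logger", SchedParams("best_effort"))
    print(dispatch_order(Node("A"), [logger, valve, alarm]))
    print(dispatch_order(Node("B"), [valve, alarm, logger]))
```

Because the parameters follow the thread, both nodes dispatch in the same order regardless of arrival order, which is the property the position argues for.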


Issue 3 “What is the best method for validating that large, distributed, parallel
architecture realtime systems behave as specified?”

Position  3  This  remains  an  open  question.  There  are  the  classical,  conservative 
methods  of  component,  subsystem,  and  ensemble  testing  to  define  the 
execution  profiles  of  the  system’s  VM  services  and  application  objects 
under  “typical”  operating  conditions.  There  is  “scenario  analysis”  based 
on  certain  assumptions  about  the  probabilities  of  system  events  (e.g., 
internal  faults,  external  events),  often  based  on  Monte  Carlo  simulations 
and the like. And for sufficiently small systems, there are correctness
proofs and/or exhaustive testing.

But,  in  general,  the  problem  of  designing  a  large,  distributed,  realtime 
system  that  can  be  tested  and  verified  “correct”  under  all  of  the  non- 
deterministic operating conditions it might face is difficult. In support of
our  efforts  at  designing  high-volume,  commercial-grade,  distributed 
control  systems  that  govern  the  behavior  of  hazardous  (e.g.,  chemical, 
refining,  power  production,  pharmaceutical)  processes,  this  issue  is 
receiving  a  great  deal  of  attention. 

Current systems achieve validation through 1) static configurations, 2)
limited applications functionality, 3) conservative implementations, and 4)
exhaustive testing. These techniques work relatively well for
low level regulatory loop control problems. They will not be sufficient to
handle the larger application domains envisioned for our next generation
of plant control systems. In current systems, critical realtime applications
reside completely within a single node (element) in the distributed system.
In the next generation, applications will span nodes and require
“transnode” validation of performance.

The means to achieve predictable, correct performance of distributed
applications is through strict adherence to encapsulation, reuse, and
location transparency in design; and to implement system elements with
clearly defined interfaces that guarantee that parameters crossing the
boundary are consistent (e.g., type and range checking). Invocation side
effects must be bounded. System messages, from asynchronous fine grain
signals (e.g., interrupts) to synchronous coarse grain IPC messages (i.e.,
request-reply semantics), must be universally understood.
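An interface that checks the type and range of every parameter crossing the boundary might look like the sketch below. All names here (ParamSpec, checked_call, the set_setpoint operation and its limits) are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass


class InterfaceError(ValueError):
    """Raised when a call would carry an inconsistent parameter across the boundary."""


@dataclass(frozen=True)
class ParamSpec:
    name: str
    type_: type
    low: float
    high: float


def checked_call(specs, operation, **args):
    """Invoke operation only if every argument matches its declared type and range."""
    expected = {s.name for s in specs}
    if set(args) != expected:
        raise InterfaceError(f"expected parameters {sorted(expected)}, got {sorted(args)}")
    for spec in specs:
        value = args[spec.name]
        if type(value) is not spec.type_:  # strict: no implicit conversions
            raise InterfaceError(
                f"{spec.name}: expected {spec.type_.__name__}, got {type(value).__name__}")
        if not (spec.low <= value <= spec.high):
            raise InterfaceError(f"{spec.name}={value} outside [{spec.low}, {spec.high}]")
    return operation(**args)


# Hypothetical interface of a regulatory loop object.
SET_SETPOINT = (
    ParamSpec("loop_id", int, 0, 255),
    ParamSpec("setpoint", float, 0.0, 100.0),
)


def set_setpoint(loop_id, setpoint):
    return f"loop {loop_id} -> {setpoint}"


if __name__ == "__main__":
    print(checked_call(SET_SETPOINT, set_setpoint, loop_id=3, setpoint=75.0))
    try:
        checked_call(SET_SETPOINT, set_setpoint, loop_id=3, setpoint=750.0)
    except InterfaceError as err:
        print("rejected:", err)
```

Rejecting the call at the boundary keeps the side effects of a bad invocation bounded: the target object never sees the inconsistent value.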

With  these  facilities  in  place,  supported  by  appropriate  directory  (e.g., 
request  broker)  services,  the  distributed  nature  of  the  system  may  be 
understood  by  verification  and  validation  test  suites  applied,  incrementally 
and iteratively, to selected ensembles of system objects. These various
subsets,  whose  interactions  provide  for  key  mission  critical  services  of  the 
distributed  realtime  environment,  can  be  validated  prior  to  their 
engagement  in  the  larger  population  whose  collective  behavior  is  even 
less  deterministic. 

These are basic principles behind, and motivations for, object oriented
systems. And they are well understood, in principle. What is missing are
formal and consistent specification methods, object interface (message)
semantics, and the associated tools and techniques for realizing the V&V
functions expressed above. There are a number of candidate disciplines
(e.g., IDEF-based), but they are not generally adequate for large control
systems which exhibit asynchronous, non-deterministic behavior. They
are better suited to large “transaction-oriented” systems whose request-reply
semantics are well defined (e.g., banking ATM’s).


Issue 4 “Given that resources were available to enhance the design and testing
methodologies for this class of systems, what are the most promising areas
where these resources could be applied?”

Position 4 The area where current technologies and practices are most deficient is in
the specification and implementation of task completion-time constraints,
especially when these computations are carried by transnode threads. The
problem is difficult in a single node system, whether that node be uni- or
multi-processor in construction. The problem of distributed realtime
guarantees is especially difficult because the underlying VM’s do not
provide the realtime IPC (RPC) mechanisms that provide for the requisite
services. CMU’s Archon Project, and its Alpha OS, provides an excellent
example of an implemented solution for this general problem, but it
suffers from i) not being a commercial system, ii) not being supported by
formal methods and tools, and iii) being avant garde.

The availability of commercial-grade facilities must wait for OSF’s
realtime extensions to its MACH microkernel, extensions to Chorus, next
generations of Sun’s Solaris, IBM’s rationalization of OS/2 and AIX, and
so on. These distributed “object-oriented operating systems” will likely
see commercial application in the late 90’s. Their introduction will
encourage the development of tools, object libraries with standard
interface semantics (e.g., CORBA, DOMF), and a body of experience
from which to develop formal methods.

The most promising areas for support are i) the specification and
development of standardized realtime transnode IPC services which can
carry scheduling information, ii) policies and mechanisms to provide for
aggressively best effort scheduling of ensembles of threads executing on a
given (uni- or multi-processor) node within a distributed system, and iii)
methods and tools for expressing (in the functional requirements, design,
and V&V documentation) precisely the completion-time performance
required of a distributed computation. This includes the effects on the
external environment and the internal state of the distributed system of
being early, on time, or late.
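One simple way to state a completion-time requirement precisely is as a window with an earliest and a latest acceptable completion time. The sketch below assumes that representation; the class name and the example figures are hypothetical, and a fuller notation would also record what each outcome means for the plant and for the internal system state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletionWindow:
    """Acceptable completion interval for a distributed computation, in ms."""
    earliest: float
    latest: float

    def classify(self, completed_at: float) -> str:
        """Return 'early', 'on time' or 'late' for an observed completion time."""
        if completed_at < self.earliest:
            return "early"
        if completed_at > self.latest:
            return "late"
        return "on time"


if __name__ == "__main__":
    # Hypothetical requirement: a valve command must take effect 40-50 ms after release.
    window = CompletionWindow(earliest=40.0, latest=50.0)
    for t in (35.0, 45.0, 52.5):
        print(f"completed at {t} ms: {window.classify(t)}")
```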
1.  INTRODUCTION

The  following  material  considers  the  four  issues  raised  in  the  Invitation  to  Attend: 

•  What is the best method or methodology for designing large, distributed real-time
systems where processing elements may be parallel architectures?

•  What should be the relationship between real-time Design Theory and real-time
Scheduling Theory in a design methodology for this class of system?

•  What is the best method for validating that large, distributed, parallel architecture
real-time systems behave as specified?

•  Given  that  resources  were  available  to  enhance  the  design  and  testing  methodologies  for 
this  class  of  system,  what  are  the  promising  areas  where  these  resources  could  be 
applied? 

In  giving  a  more  detailed  response  to  the  first  question  a  number  of  the  other  issues  are 
addressed.  We  focus,  in  this  position  paper,  on  non-functional  issues  such  as  timeliness  and 
dependability. 

We assume that the allocation of software objects to the set of distributed nodes is
essentially static (i.e. is undertaken before execution and then remains unchanged unless
reconfiguration following significant failure is employed). Within a node, where parallel
architectures may be employed, allocation is dynamic. We also assume that the distributed
nodes are linked by shared communication media.

2.  DESIGN  METHODS 
2.1.  A  Design  Framework 

The most important stage in the development of any real-time system is the generation of a
consistent design that satisfies an authoritative specification of requirements. Where real-time
systems differ from the traditional data processing system is that they are constrained by
certain non-functional requirements (e.g. dependability and timing). Typically the standard
structured design methods do not cater well for these types of constraints.

It is increasingly recognised that the role and importance of these non-functional
requirements in the development of complex critical applications has hitherto been
inadequately appreciated. Specifically, it has been common practice for system developers,
and the methods they use, to concentrate primarily on functionality and to consider
non-functional requirements comparatively late in the development process. Experience shows
that this approach fails to produce dependable real-time systems. For example, often timing
requirements are viewed simply in terms of the performance of the completed system. Failure
to meet the required performance often results in ad hoc changes to the system. This is not a
cost effective process.

If hard real-time systems are to be engineered to high levels of dependability, the
real-time design method must provide:

•  the explicit recognition of the types of activities/objects that are found in hard real-time
systems (i.e. cyclic and sporadic activities);

•  the explicit definition of the application timing requirements for each object;

•  the explicit definition of the application reliability requirements for each object;

•  the definition of the relative importance (criticality) of each object to the successful
functioning of the application;

•  the support for different modes of operation: many systems have different modes of
operation (e.g., take-off, cruising, and landing for an aircraft); all the timing and
importance characteristics will therefore need to be specified on a per mode basis;

•  the explicit definition and use of resource control objects;

•  the decomposition to a software architecture that is amenable to processor allocation,
schedulability and timing analysis;

•  facilities and tools to allow the schedulability analysis to influence the design as early as
possible in the overall design process;

•  restriction on the use of the implementation language so that worst case execution time
analysis can be carried out;

•  tools to perform the worst case execution time and schedulability analysis.

A constructive way of describing the process of system design is as a progression of
increasingly specific commitments. These commitments define properties of the system
design which designers operating at a more detailed level are not at liberty to change. Those
aspects of a design to which no commitment is made at some particular level in the design
hierarchy are effectively the subject of obligations that lower levels of design must address.
Early in design there may already be commitments to the structure of a system, in terms of
object definitions and relationships. However, the detailed behaviour of the defined objects
remains the subject of obligations which must be met during further design and
implementation.

The process of refining a design (transforming obligations into commitments) is
often subject to constraints imposed primarily by the execution environment. The execution
environment is the set of hardware and software components (e.g. processors, task dispatchers,
device drivers) on top of which the system is built. It may impose both resource constraints
(e.g. processor speed, communication bandwidth) and constraints of mechanism (e.g. interrupt
priorities, task dispatching, data locking). To the extent that the execution environment is
immutable these constraints are fixed.

Obligations, commitments and constraints have an important influence on the
architectural design of any application. We therefore define two activities within architectural
design:

•  the  logical  architecture  design  activity; 


•  the  physical  architecture  design  activity. 

The logical architecture embodies commitments which can be made independently of the
constraints imposed by the execution environment, and is primarily aimed at satisfying the
functional requirements. The physical architecture takes these and other constraints into
account, and embraces the non-functional requirements. The physical architecture forms the
basis for asserting that the application’s non-functional requirements will be met once the
detailed design and implementation have taken place. It addresses timing and dependability
requirements, and the necessary schedulability analysis that will ensure (guarantee) that the
system once built will function correctly in both the value and time domains (within some
failure hypotheses).

[Diagram with stages labelled: Requirements Definition; Logical Architecture Design;
Physical Architecture Design (Schedulability Analysis); Detailed Design; Coding, including
Code Timing Estimations; Testing, including Code Timing Measurements. The label
Execution Environment Constraints appears twice.]

Figure 1: The Hard Real-time Life Cycle

Figure 1 gives an overview of the proposed framework. It is important to note, however, that
this figure does not identify phases (in a traditional waterfall model) but (potentially
concurrent) stages (activities). The output of each stage is a "product" that can be
independently evaluated. Consistency of notation between products clearly improves the
design process (see the following discussions on computational models and HRT-HOOD).

2.1.1.  Logical  Architectural  Design 

There are two aspects of any design method which facilitate the logical architecture design of
hard real-time systems. Firstly, explicit support must be given to the abstractions that are
typically required by hard real-time system designers. We take the view that if designs are to
be well structured so that they can be analysed, then it is better to provide specific design
guidelines rather than general design abstractions. For example, supporting the abstraction of
a periodic activity allows the structure of that activity to be visible to the design process which,
in turn, facilitates its analysis. In contrast, allowing the designer to construct periodic
activities out of some more primitive "task" activity produces designs which are more difficult
to analyse. Clearly, care must be taken so that the design method does not become cluttered
with too many abstractions but an adequate level of support is desirable.

The second aspect involves constraining the logical architecture so that it can be allocated
to a distributed system and analysed during the physical architecture design activity. One
means of achieving this is to choose an appropriate computational model. Concurrency is
obviously an important abstraction within any such model. We do not, however, believe that a
synchronous communication model, such as that contained in CSP or CCS, is the correct
abstraction for real-time systems. Rather we support an asynchronous model in which active
objects (i.e. objects that can give rise to spontaneous computation) interact via asynchronous
messages or non-active objects (which may be either entirely passive or provide some form of
protection over the data being communicated between the active objects; e.g. mutual
exclusion).

This computational model is used in the Mascot-3 design method and the formal
method TAM. We have also embodied the model into the HRT-HOOD design method, see
section 2.1.3.

2.1.2.  Physical  Architectural  Design

The primary focus of this activity is the allocation of objects to the distributed system and the
analysis of the worst case response times for transactions (precedence related objects) running
through the distributed system. To achieve this the logical architecture must identify the
run-time objects of the application and give estimates of their resource needs (CPU cycles,
communication load etc). The available resources in the execution environment must
obviously also be known.

Rather than design to budget we believe that top level designs should be analysed to
obtain an estimate of their likely resource needs. These estimates should form the basis of an
initial schedulability analysis. If the design will not fit then extra resources must be found or
the scope of the design reduced. As detailed design and coding are undertaken the
schedulability analysis is re-done. Fixed budgeting and early allocation (of computing
activities to nodes of the distributed system) reduce the flexibility required in the design
process.
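An initial schedulability analysis of this kind can be illustrated with the standard response-time recurrence for fixed-priority preemptive scheduling on a single processor. The paper does not say which analysis the authors use, so this is a stand-in: it assumes integer time units, deadlines equal to periods, and no blocking or communication delay, and the function name and task figures are hypothetical.

```python
def response_time(tasks, i):
    """Worst-case response time of tasks[i], or None if it misses its deadline.

    tasks is a list of (execution_time, period) pairs ordered from highest to
    lowest priority; each deadline is assumed equal to the period.
    """
    c_i, t_i = tasks[i]
    r = c_i
    while True:
        # Interference from every higher-priority task released within r;
        # -(-r // t_j) is the integer ceiling of r / t_j.
        interference = sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        nxt = c_i + interference
        if nxt > t_i:
            return None          # the estimate already exceeds the deadline
        if nxt == r:
            return r             # fixed point reached
        r = nxt


if __name__ == "__main__":
    estimates = [(1, 4), (2, 6), (3, 12)]   # early (execution time, period) estimates
    print([response_time(estimates, i) for i in range(len(estimates))])
    # If detailed design revises the estimates upwards, the analysis is re-run:
    revised = [(2, 4), (3, 6), (3, 12)]
    print([response_time(revised, i) for i in range(len(revised))])
```

A `None` entry signals that the design "will not fit": either extra resources must be found or the scope reduced, as the text describes.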

In order to get sufficient flexibility into the allocation process it must be possible for a
design to be mapped on to the distributed system in many different ways. Ideally it should
not be necessary to artificially split coherent objects in order to facilitate distribution. Rather
the design should articulate a population of objects that can be combined in many different
ways.

An application expressed in the computational model described earlier can easily be
distributed as the active objects only have an asynchronous relationship with one another. For
example a cyclic object on one node can release (and pass data to) a sporadic object on another
node by using a distributed implementation of the appropriate non-active object or
asynchronous message. All non-active objects can be distributed without affecting their
functionality. Only asynchronous messages are needed at the communication layer. The
temporal behaviour of the distributed system will change (when compared with the
non-distributed implementation), but this can be analysed by the scheduling model.
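The effect of distributing such a pair of objects can be sketched with a small discrete-time simulation. The Mailbox class, its delay parameter (standing in for a network link) and the figures used are assumptions made for the example; they are not taken from the paper.

```python
from collections import deque


class Mailbox:
    """Non-active object carrying asynchronous messages. send never blocks;
    delay=0 models a local mailbox, delay>0 a distributed implementation."""

    def __init__(self, delay=0):
        self.delay = delay
        self._queue = deque()

    def send(self, now, msg):
        self._queue.append((now + self.delay, msg))

    def receive(self, now):
        if self._queue and self._queue[0][0] <= now:
            return self._queue.popleft()[1]
        return None


def simulate(mailbox, period=10, cycles=3, horizon=50):
    """A cyclic object releases a sporadic object once per period.
    Returns (time handled, sample number) for each release."""
    handled = []
    for now in range(horizon):
        if now % period == 0 and now // period < cycles:
            mailbox.send(now, now // period)      # cyclic object: asynchronous release
        msg = mailbox.receive(now)                # sporadic object: runs when released
        if msg is not None:
            handled.append((now, msg))
    return handled


if __name__ == "__main__":
    print("same node:  ", simulate(Mailbox(delay=0)))
    print("across link:", simulate(Mailbox(delay=3)))
```

The same samples are handled in the same order in both runs; only the handling times shift, which is the temporal change that the scheduling model has to account for.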

2.1.3.  HRT-HOOD

Within a project funded by the European Space Agency (ESA) we have been developing a
structured design method for hard real-time systems. We took HOOD (Hierarchical Object
Oriented Design) as a basis but modified the method in line with the framework outlined
above. In HRT-HOOD (Hard Real-Time HOOD) there are five object types: PASSIVE,
ACTIVE, PROTECTED, CYCLIC and SPORADIC. The CYCLIC objects execute
periodically; the SPORADIC ones have a minimum inter-arrival time; the PASSIVE objects
offer no protection on the data they encapsulate whereas the PROTECTED objects can offer
protection (but are non-active). ACTIVE objects have no inherent (defined) structure. In a
real-time system they must decompose into terminal objects of the other four types.

There are strict rules as to how object types may decompose and use the interfaces of
other object types. The culmination of the logical design activity is a set of terminal objects
that are well defined (i.e. not ACTIVE). All CYCLIC and SPORADIC objects interact via
PASSIVE or PROTECTED objects, or via explicit asynchronous message passing (i.e. all
communication is asynchronous). For example a CYCLIC object may asynchronously release
a SPORADIC object. If an immediate response from an object is required then an
asynchronous message can trigger a transfer of control operation in a CYCLIC or SPORADIC
object (if such an operation is defined in the object’s interface).

The  timing  requirements  of  the  design  are  represented  as  attributes  of  the  objects. 
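The five object types, and the idea of carrying timing requirements as object attributes, can be sketched as a small data model. This is not HRT-HOOD notation: the attribute names (period, min_interarrival, deadline) and the two checks are assumptions chosen to mirror the rules stated in the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Kind(Enum):
    PASSIVE = "PASSIVE"
    ACTIVE = "ACTIVE"
    PROTECTED = "PROTECTED"
    CYCLIC = "CYCLIC"
    SPORADIC = "SPORADIC"


@dataclass
class HrtObject:
    name: str
    kind: Kind
    period: Optional[int] = None            # CYCLIC: release period
    min_interarrival: Optional[int] = None  # SPORADIC: minimum separation of releases
    deadline: Optional[int] = None
    children: list = field(default_factory=list)


def terminals(obj):
    """Leaf objects of the decomposition hierarchy."""
    if not obj.children:
        return [obj]
    return [t for child in obj.children for t in terminals(child)]


def check_design(root):
    """Return a list of rule violations among the terminal objects."""
    problems = []
    for t in terminals(root):
        if t.kind is Kind.ACTIVE:
            problems.append(f"{t.name}: terminal object must not be ACTIVE")
        if t.kind is Kind.CYCLIC and t.period is None:
            problems.append(f"{t.name}: CYCLIC object needs a period")
        if t.kind is Kind.SPORADIC and t.min_interarrival is None:
            problems.append(f"{t.name}: SPORADIC object needs a minimum inter-arrival time")
    return problems


if __name__ == "__main__":
    design = HrtObject("controller", Kind.ACTIVE, children=[
        HrtObject("sampler", Kind.CYCLIC, period=20, deadline=20),
        HrtObject("alarm", Kind.SPORADIC, min_interarrival=100, deadline=10),
        HrtObject("plant_state", Kind.PROTECTED),
    ])
    print(check_design(design) or "design is well defined")
```

Keeping the timing attributes on the objects themselves means the same structure can be handed to a schedulability analysis without a separate timing document.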

Mappings from HRT-HOOD to Ada 9X and to Ada 83 (with task optimisation) have been
undertaken. Preemptive priority based dispatching is used and PROTECTED objects
employ ceiling priorities to obtain their protection. Each CYCLIC and SPORADIC object
contains a single thread (task). A case study implementation of a single processor system has
been undertaken. This involved re-engineering the Olympus satellite’s AOCS (Attitude and
Orbital Control System). The system was designed in HRT-HOOD and implemented using a
modified version of Ada 83 (to reflect Ada 9X facilities).

HRT-HOOD is not presented as the definitive means of designing real-time systems, but
it has been defined to directly address the list of issues given earlier. We believe these issues
are crucially important.


2J,.  Designing  for  Parallel  Architectures 

Whereas  the  level  of  concurrency  in  a  distributed  system  is  often  explicit,  the  exploitation  of 
parallel  hardware  requires  an  implicit  approach.  This  is  because  the  task  of  designing  software 
that  can  execute,  efficiently,  with  different  levels  of  parallelism  (including  none)  is 
exceedingly  problematic.  The  design  process  therefore  needs: 

•  Programming  abstractions  that  can  reduce  the  burden  of  constructing  parallel  programs. 

• Operating system (kernel) support for "infinite" parallelism.

• Operating system (kernel) support for dynamic reconfiguration following the loss of
computing elements.

All  of  these  activities  need  also  to  be  coordinated  with  an  effective  approach  to  real-time. 

One possible approach here is to allow an object to define its operation in such a way as
to  allow  a  variable  number  of  parallel  threads  to  be  created  by  the  kernel.  A  minimum  number 
would  however  have  to  be  guaranteed  for  the  worst  case  response  time  to  be  calculated. 

3.  DESIGN  METHODS  AND  SCHEDULING  THEORY 
As  indicated  above  the  three  key  issues  are: 

(a)  The  rules  of  decomposition  allowed  in  the  design  method  must  not  compromise  the  need 
to  analyse  the  complete  design. 

(b)  Abstractions  used  in  real-time  systems  (e.g.  cyclic  activities,  deadlines,  response  times) 
must  be  supported  directly  in  the  design  method. 

(c) Scheduling analysis must be applied as early as possible (to an incomplete design).

To balance flexibility, predictability and efficiency, preemptive (or deferred preemptive)
dispatching of priority assigned threads is recommended, with priorities being, essentially, static.
The use of cyclic executives is considered to be too restrictive; the paper by Locke argues
this case convincingly. Rate monotonic scheduling analysis (RMSA) has been very successful
in showing how an engineering approach can be applied to give predictions for the timing
behaviour of applications using priority based run-time dispatching. The standard equations in
RMSA are exceedingly simple and use only a measure of processor utilisation to give
predictions. More comprehensive analysis can be achieved by considering the worst case
response times (WCRT) of the threads incorporated in the design. Recent analysis has used
this approach to give predictable WCRT in the presence of:

•  Context  switches 

•  Release  jitter 

• Clock interruptions with manipulations of delay queues

•  Threads  with  tight  requirements  for  I/O  jitter  control 

•  Mode  changes 

•  Sporadic  threads  that  have  deadlines  unrelated  to  their  worst  case  arrival  rate 

•  Sporadic  threads  that  arrive  in  bursts 

•  Deadlines  less  than,  or  greater  than,  period  (for  cyclic  threads) 

•  Threads  with  more  than  one  deadline 

•  Threads  with  offset  relations  (with  respect  to  one  another) 
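
As a minimal sketch of the two levels of analysis described above, the following Python fragment applies the standard utilisation bound n(2^(1/n) - 1) and then the basic worst case response time recurrence R = C + sum over higher priority threads of ceil(R/T_j) C_j. The function names and the thread set are hypothetical, deadlines are assumed equal to periods, and none of the extensions listed above (context switches, release jitter, bursts, offsets and so on) are modelled.

```python
def utilisation_bound(n):
    """Utilisation bound n(2^(1/n) - 1) for n periodic threads."""
    return n * (2 ** (1.0 / n) - 1)


def passes_utilisation_test(tasks):
    """tasks: list of (C, T) pairs. Sufficient, but not necessary, test."""
    utilisation = sum(c / t for c, t in tasks)
    return utilisation <= utilisation_bound(len(tasks))


def worst_case_response_times(tasks):
    """tasks: (C, T) pairs in priority order, highest first (deadline = T).

    Returns one WCRT per thread, or None where the deadline is missed.
    """
    results = []
    for i, (c_i, t_i) in enumerate(tasks):
        r = c_i
        while True:
            # Interference from every higher priority thread released in [0, r).
            interference = sum(((r + t_j - 1) // t_j) * c_j
                               for c_j, t_j in tasks[:i])
            r_next = c_i + interference
            if r_next > t_i:
                results.append(None)   # deadline missed
                break
            if r_next == r:
                results.append(r)      # fixed point reached
                break
            r = r_next
    return results


# Illustrative thread set (integer time units), in rate monotonic order.
tasks = [(1, 4), (2, 6), (3, 12)]
print(passes_utilisation_test(tasks))     # False: utilisation 0.83 exceeds 0.78
print(worst_case_response_times(tasks))   # [1, 3, 10]
```

The example set fails the utilisation test yet every thread meets its deadline under the response time analysis, which illustrates why the latter is described as more comprehensive.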


There remains work to be done with multi-processor (parallel) nodes (i.e. to make sure
that  this  analysis  is  applicable  to  a  group  of  parallel  processors  sharing  the  same  set  of 
threads). 

The scheduling of the communications media is more problematic. TDMA is very
inflexible; dynamic collision based Ethernet protocols, although theoretically predictable,
cannot yet be recommended. A software token bus protocol would seem the best compromise.
It can be analysed and can be integrated with the allocation and node scheduling activities. If
fault-tolerance is also to be addressed then some form of atomic broadcast will be needed.
This will also have consequences for scheduling.

It seems likely that DSGM (distributed shared global memory) will play an
increasingly important role in distributed real-time systems. The performance and applicability
of this technology make it a genuine alternative to LAN based approaches. Although, in some
ways, such memories can be simply modelled as slow ordinary memory, this may not be an
adequate model. More realistic scheduling analysis may be needed.

It is well known that predictions based on a worst case scenario are pessimistic due to
sporadic activities not occurring as often as worst case and hardware performing better than
can be relied upon (due to cache and pipelining). The incorporation of "unbounded"
components into a design (so that they can make use of spare capacity) is currently an active
research topic. Once paradigms are developed then the design methods must allow these new
abstractions to be used directly. We anticipate new object types being added to HRT-HOOD.

Different  scheduling  regimes  will  impose  different  restrictions  on  the  design  process. 
But  it  must  be  the  case  that  the  needs  of  the  scheduling  theory  dictate  what  the  properties  of 
the  design  method  should  be. 

To  facilitate  flexible  (static)  allocation  of  objects  to  nodes  it  is  important  that  objects  in 
the design method are not closely coupled. The use of asynchronous thread interaction does
give considerable freedom to the allocation process. But for large systems the complexity of
the allocation and configuration activity remains significant. We have had some success at
applying simulated annealing to the allocation activity. Clearly tool support is needed.
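
To indicate what such an allocation tool might do, the following Python fragment is a minimal sketch of simulated annealing applied to the static allocation of objects to nodes. The cost function (remote message traffic plus a heavy penalty for overloaded nodes), the cooling schedule and all names are assumptions made for illustration; they are not taken from the work cited above.

```python
import math
import random


def cost(alloc, util, traffic, capacity):
    """Hypothetical cost: remote traffic plus a penalty for overloaded nodes."""
    remote = sum(w for (a, b), w in traffic.items() if alloc[a] != alloc[b])
    load = {}
    for obj, node in alloc.items():
        load[node] = load.get(node, 0.0) + util[obj]
    overload = sum(max(0.0, l - capacity) for l in load.values())
    return remote + 1000.0 * overload


def anneal(util, traffic, nodes, capacity=1.0, steps=5000,
           t0=10.0, alpha=0.999, seed=0):
    """Return (best allocation found, its cost)."""
    rng = random.Random(seed)
    objs = sorted(util)
    alloc = {o: rng.choice(nodes) for o in objs}
    cur = cost(alloc, util, traffic, capacity)
    best, best_cost = dict(alloc), cur
    temp = t0
    for _ in range(steps):
        # Propose moving one randomly chosen object to a random node.
        obj = rng.choice(objs)
        old = alloc[obj]
        alloc[obj] = rng.choice(nodes)
        new = cost(alloc, util, traffic, capacity)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
            if cur < best_cost:
                best, best_cost = dict(alloc), cur
        else:
            alloc[obj] = old   # reject the move
        temp *= alpha
    return best, best_cost


# Illustrative data: thread utilisations and message traffic between objects.
util = {"sensor": 0.4, "filter": 0.5, "control": 0.3, "logger": 0.2}
traffic = {("sensor", "filter"): 5, ("filter", "control"): 3,
           ("control", "logger"): 1}
allocation, total = anneal(util, traffic, ["node0", "node1"])
print(allocation, total)
```

In a real tool the cost function would instead be driven by the scheduling analysis, so that only allocations whose nodes and communication media remain schedulable are acceptable.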

4. VALIDATION

Formally  proving  large  concurrent  systems  is  beyond  current  engineering  practice.  Testing  of 
highly  dependable  software  cannot  provide  the  reliability  levels  needed.  It  is  therefore  clear 
that  the  design  method  (and  process)  must  itself  support  validation.  The  following  points  will 
help bring this about:

(a)  Use  formal  methods  to  prove  the  sequential  behaviour  of  objects. 

(b)  Use  asynchronous  interactions  between  threads. 

(c)  Use  scheduling  analysis  to  prove  that  the  end-to-end  timing  requirements  are  met. 

(d)  Use  proven  (certified)  run-time  kernels. 

The  use  of  preemptive  priority  based  scheduling  should  allow  kernels  to  be  certified. 


5.  PROMISING  AREAS  OF  RESEARCH 

It would seem unlikely that funding research in design methods themselves would be effective.
Methods need good tool support and a user population to evaluate them. Rather it will be
more cost effective to focus on particular aspects of the problem. We would give top priority
to consideration of how an object (using the term to imply any module structure) interface and
specification can be analysed to give a meaningful prediction of the resource needs of that
object when it is implemented in software. Clearly this issue is linked to that of reusability
and classification; it will also imply that the notation used to specify the interface and
specification of the object will need particular expressive power.

Much of the necessary scheduling theory is in place although the ramifications of parallel
architectures require further study, and the scheduling of the communication media remains
an open issue. But exemplar systems are still rare. The funding of case studies (i.e.
implementations), and where appropriate tools, will help demonstrate the power of these
techniques.

6.  OTHER  ISSUES 

It  is  often  the  case  that  the  requirement  for  fault  tolerance  is  added  to  the  list:  large, 
distributed,  parallel  and  real-time.  If  this  is  the  case  then  a  number  of  other  issues  need  to  be 
considered: 

•  The  degree  of  parallelism  may  decrease  over  time  (for  long  life  non-stop  systems).  The 
application  programmer  should  not  have  to  program  this  reconfiguration. 

• Communications must be replicated; either by acknowledgement and rebroadcast (if
acknowledgement times out), or by diffusion (sending the message more than once to
start with).

•  Processing  elements  may  need  to  be  replicated.  Within  a  node  is  easier  but  does  not  give 
the  same  fault  coverage  as  replication  between  nodes.  The  latter,  however,  requires  some 
form of atomic broadcast on the communication media.

Although all these issues have been addressed in non-real-time systems, the added requirement
for  timely  performance  significantly  complicates  the  problems  involved. 

7.  CONCLUSIONS 

The  needs  for  flexible  allocation  (configuration)  and  predictable  performance  necessitate  the 
use of an asynchronous computational model. On top of this the design method should provide
object types that correspond to the abstractions found in real-time applications. We, in
HRT-HOOD, have used two forms of active object (CYCLIC and SPORADIC) and two forms of
non-active object (PASSIVE and PROTECTED). All communications between active objects
can be remote without affecting the functionality of the code.

Preemptive  priority  based  scheduling  (on  the  multiprocessor  nodes),  with  an  equivalent 
behaviour  on  the  communication  media,  gives  an  appropriate  level  of  flexibility  for  the 
allocation  process.  It  also  facilitates  analysis  of  the  timing  behaviour  of  the  application. 
The design of complex real-time systems which may involve parallel architectures is a difficult task
with  no  clear-cut  solutions.  The  scientific  analysis  of  the  critical  issues  in  this  field,  however,  can 
only  be  realized  by  the  availability  of  a  formal  framework  for  specification  of  system  behavior  and 
requirements,  that  abstracts  irrelevant  details  and  permits  formal  analysis.  In  particular,  an  executable 
specification can be considered as a “causal model” of a system that can be used for verification, as well
as for testing, fault diagnosis, analysis of scheduling requirements and performance evaluation.
Analysis  can  reduce  significantly  the  time  and  cost  of  evaluating  complex  parallel  systems  compared  to 
the  purely  experimental  approaches.  Experimental  efforts  are  still  required,  however,  to  validate  the 
results  of  the  formal  analysis  and  to  help  determine  the  most  promising  research  issues.  Since 
requirements  on  systems  can  vary  enormously,  no  single  design  methodology  is  guaranteed  to  be 
optimal.  A  formal  framework  in  which  both  requirements  and  system  design  can  be  specified  precisely 
is  the  key  to  reducing  the  development  cost  and  verifying  the  adequacy  of  complex  designs.  The  goal 
for  optimality  in  this  view  is  replaced  by  the  goal  of  satisfying  a  set  of  pre-specified  requirements.  An 
overview  of  a  particular  specification  method  that  meets  many  of  these  requirements  is  presented  in  this 
paper. 

1  Introduction 

In  this  paper,  we  address  the  following  four  questions  raised  in  the  Call  for  Papers: 

1. What is the best method or methodology for designing large, distributed real-time systems,
where  processing  elements  may  have  a  parallel  architecture? 

2.  What  should  be  the  relationship  between  real-time  Design  Theory  and  real-time  Scheduling 
Theory  in  a  design  methodology  for  this  class  of  systems? 

3. What is the best method for validating that large, distributed, parallel architecture real-time
systems  behave  as  specified? 

4.  Given  that  resources  were  available  to  enhance  the  design  and  testing  methodologies  for  this 
class  of  systems,  what  are  the  most  promising  areas  where  these  resources  could  be  applied? 

These  questions  can  be  addressed  from  various  perspectives.  Our  approach  is  to  explore  the  role 
that  “formal  methods”  can  play  in  the  design  of  complex  and  safety-critical  real-time  systems.  As 
indicated  in  [LL92],  one  can  no  longer  rely  on  “engineering  judgment”  to  assure  safety  and  other 
properties of a design. Also, as predicted in [CK91], in the future, “programmers, tired of
debugging  difficulties,  will  use  design  methods  that  produce  correct  programs  as  well  as  formal 
methods  of  proving  correctness  and  methods  of  integrating  proofs  and  testing.”  Our  goal  is  to 
indicate how formal methods can contribute to the establishment of a scientific basis for designing
complex  real-time  systems.  In  particular,  our  contention  is  that  to  be  able  to  predict  whether  a 
design  is  adequate,  one  must  have  specifications  of  the  problem  and  the  design.  These 
specifications  must  be  formal  enough  to  permit  analysis  of  the  properties  of  interest. 

A formal method is defined here as “a formalism for specifying the properties of a system, in a
way that formal analysis can be performed on it.” We adopt a somewhat liberal definition for the
term “formal” in this paper. However, we do not consider simulation, by itself, as a sufficient
criterion for judging a specification formalism to be formal. On the other hand, not all formal
methods provide a simulation capability (or executability). The most promising methods are those
that are both executable and also offer a framework for mathematical analysis of behavioral or
functional  properties. 

The potential benefits of formal methods are in (1) understanding the requirements and architectural
implications of systems, and (2) responding to the first three basic questions raised above. Our
basic tenet is that by using a formal framework, one can perform the kinds of analysis that are
necessary to evaluate a design and, at the same time, address performance issues regardless of the
types of architectures employed. Considering the difficulties often encountered in performance-
oriented experimentation on actual parallel architectures [H91], the use of formal methods can
reduce the cost of designing systems that must meet hard deadlines and strict performance
requirements. Experimental work is still required, however, to support the formal analysis,
particularly in validating assumptions relating to communication delays and execution times of
individual processes.

In  the  next  section,  we  review  the  basic  concepts  of  formal  methods  for  real-time  systems.  In 
particular,  we  present  a  brief  outline  of  the  “hierarchical  multi-state  (HMS)  machine”  approach  to 
specification  and  verification  of  systems  that  may  involve  parallel  architectures.  In  Section  3,  we 
discuss  the  benefits  of  formal  methods  in  addressing  questions  relating  to  validation,  performance, 
scheduling,  etc.  Finally,  in  Section  4,  we  present  some  conclusions  and  our  recommendations  for 
the  most  promising  areas  that  require  further  expenditure  of  resources. 

2  Overview  of  Formal  Methods  for  Real-Time  Systems 

Numerous formal methods for non-real-time systems have been reported in the literature. The
International  Standards  Organization  (ISO)  has  accepted  SDL,  Estelle,  and  LOTOS  as  “standard 
formal  description  techniques.”  Process  algebra,  Z  and  Petri  nets  are  some  of  the  other  techniques 
that  have  achieved  some  prominence,  particularly  in  Europe. 

For  real-time  systems,  the  main  drawback  of  these  methods  is  the  absence  of  a  natural 
representation  for  time.  Numerous  extensions,  however,  have  been  proposed  that  address  this 
issue. For example, [NS92] deals with the temporal extension of process algebra and various
timed  extensions  of  Petri  nets  have  appeared.  Another  major  trend  has  been  to  modify  temporal 
logic  to  deal  with  hard  deadlines  or  to  add  delays  in  finite-state  machine  models  of  processes.  A 
temporal  logic  perspective  on  these  issues  can  be  found  in  [MP92]. 

The  principal  problem  with  most  of  the  formal  methods  for  real-time  systems  is  the  inability  to  deal 
with complexity. For example, state proliferation makes the use of finite-state machine models
difficult, except for relatively simple cases. The use of multiple communicating finite-state
machines limits this problem to some extent, but not sufficiently. The most promising state
representations that address the complexity issue are hierarchical “multi-state” models, in which
multiple states may be active at a particular level of hierarchy. One example is Statecharts [Ha87],
which  is  used  mainly  for  simulation;  another  is  the  HMS  machine  model  to  be  discussed  later  in 
this  section  which  was  originally  introduced  in  a  simpler  form  in  [GS87]. 

A  combination  of  state  modeling  within  a  temporal  framework  seems  to  be  the  most  promising  way 
of  specifying  real-time  systems.  The  standard  way  of  accomplishing  this,  even  for  hierarchical 
state  models,  has  been  to  add  delays  on  transitions.  This  is  a  highly  limited  view  of  time  since  the 
language  of  delays  is  simply  inadequate  for  dealing  with  complex  temporal  relationships.  In 
contrast, pure temporal logic, by itself, is not capable of expressing all “regular properties” that are
definable  in  terms  of  state  models. 

A  more  fundamental  problem  with  all  standard  specification  languages  for  real-time  systems  is  the 
almost  exclusive  use  of  “future  time”  as  a  basis  for  defining  behavior.  Thus,  in  many  formalisms, 
individual actions are defined in the following form: “if the system is in state s, then if the event a
occurs at time t, the system will move to state s' sometime between t+t1 and t+t2.” The inherent
assumption  here  is  that  the  world  is  deterministic  and  the  future  is  predictable.  In  point  of  fact,  one 
of  the  basic  characteristics  of  a  real-time  system  is  that  its  inputs  are  unpredictable.  To  build  a 
“causal  model”  of  a  system  for  analysis,  the  only  reasonable  assumption  is  to  define  what  will 
happen  now  or,  in  case  of  a  discrete  model  of  time,  what  will  happen  at  the  next  moment. 
Consequently,  it  appears  more  natural  to  use  a  “past  time”  temporal  language  to  define  a  system’s 
behavior  in  terms  of  its  history. 

The “hierarchical multi-state (HMS) machine” methodology [GF88,Ga91a,GI92] combines a state
modeling approach with an interval-based temporal logic to give a comprehensive behavioral model
of  complex  real-time  systems  that  (1)  reduces  state  complexity  compared  to  traditional  state 
models,  (2)  provides  a  rich  language  for  expressing  complex  temporal  relationships,  (3)  avoids 
assumptions  of  determinism  of  future  events,  and  (4)  provides  executability,  as  well  as  the  ability 
to  verify  properties  such  as  safety  and  schedulability.  An  expressive  graphic  notation  makes  the 
formalism  accessible  to  a  larger  audience  than  many  other  formal  methods  and  a  simple  extension 
makes  possible  the  specification  of  unbounded  parallelism  under  temporal  constraints. 

Informally, an HMS machine H is a triple (S, Γ_D, Γ_N), where S is a set of “states,” Γ_D is a set of
“deterministic transitions,” and Γ_N is a set of “nondeterministic transitions.” Each state in S is
either a “primitive state” or is an HMS machine itself that may be equivalent to H, in the latter case
giving rise to a recursive hierarchy. Each primitive state is either TRUE or FALSE, thus extending
the standard notion of state in systems theory to a system with possibly multiple true states. Each
transition in Γ_D or Γ_N is a mapping from one subset of states of S (the “primaries”) to another
subset of S (the “consequents”). The “firing” of a transition at a moment of time causes its
consequent states to become TRUE, while each primary state becomes FALSE unless it is the
consequent of a transition that fires simultaneously. In contrast to traditional automata theory, the
enablement condition of a transition is defined not by messages but in terms of an interval-based
temporal logic, called TIL. The logic TIL extends propositional logic by the addition of several
new operators. The main operators are: (1) O(t) = at time t, (2) [t1, t2] = always from t1 to t2, and
(3) <t1, t2> = sometime from t1 to t2.
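
The firing rule stated above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: states are plain names, the hierarchy is flattened, and the question of which transitions are enabled (the TIL predicates) is left to the caller. The function name `fire` is hypothetical.

```python
def fire(true_states, firing):
    """Apply the firing rule for one moment of time.

    true_states: set of names of the states that are currently TRUE.
    firing: list of (primaries, consequents) pairs, one per transition
            that fires at this moment.
    Returns the set of states that are TRUE afterwards.
    """
    primaries = set()
    consequents = set()
    for prim, cons in firing:
        primaries |= set(prim)
        consequents |= set(cons)
    # Primaries become FALSE unless some simultaneously firing
    # transition lists them as a consequent; consequents become TRUE.
    return (set(true_states) - primaries) | consequents


# One transition from a single primary to two consequents: afterwards
# two states are TRUE at once, which a conventional automaton cannot express.
print(fire({"Idle"}, [({"Idle"}, {"Run", "Log"})]))
```

Because the result is a set of TRUE states rather than a single current state, the same rule covers both ordinary state changes and the forking or joining of concurrent activities.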

The  behavior  of  a  real-time  system  can  be  defined  in  terms  of  an  HMS  machine  by  first  defining  its 
attributes  hierarchically  in  terms  of  states.  Secondly,  the  changes  in  its  states  are  characterized  by 
transitions. Thirdly, the conditions under which changes occur are specified by TIL predicates.
Consider a process Proc-A that executes for a minimum of 5 seconds as long as an error flag is not
raised and aborts immediately if the error flag is raised. This would be defined in terms of two
transitions out of the state Proc-A as in Figure 1. States are denoted by boxes, transitions by dark
arrows and predicates by thin arrows using VLSI symbols for boolean operators and an encircled
T for temporal operators. The vertical transition (representing normal termination) is
nondeterministic (labeled with an asterisk) and has the TIL predicate [-5,0]Proc-A ∧ ¬ERROR.
The horizontal transition (representing abnormal termination) is deterministic and has the predicate
ERROR. At any moment of time, one can evaluate the predicates to determine if either transition
must or may fire. The predicates essentially define the causes or permissions for the transitions to
fire. For example, if Proc-A has been active continuously for at least 5 units of time, then the
predicate [-5,0]Proc-A will be true and the process Proc-A may terminate if the state ERROR is not
TRUE. Note that [-5,0]Proc-A will remain true if the transition does not fire immediately and no
deterministic assumptions about the future are necessary. On the other hand, whenever the state
ERROR becomes TRUE, the system aborts immediately. Under most traditional representations,
one would have to retract the original statement that the process A is to terminate sometime after 5
units of time if the system aborts. In our notation no such retraction is required.


Figure  1.  Specification  of  a  Simple  Real-Time  Process 
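
A minimal sketch of how the two predicates in this example might be evaluated over a recorded history is given below in Python. It assumes a discrete model of time in which the history is a list of sets of TRUE states, one per time unit with the oldest first, and it treats the interval [t1, t2] as inclusive at both ends. These choices and the function names are assumptions made for illustration and are not part of the HMS machine definition.

```python
def always(history, state, t1, t2):
    """[t1, t2]state at the latest moment, with t1 <= t2 <= 0."""
    now = len(history) - 1
    if now + t1 < 0:
        return False   # the recorded past is too short to decide
    return all(state in history[now + k] for k in range(t1, t2 + 1))


def sometime(history, state, t1, t2):
    """<t1, t2>state at the latest moment, with t1 <= t2 <= 0."""
    now = len(history) - 1
    lo = max(0, now + t1)
    return any(state in history[k] for k in range(lo, now + t2 + 1))


def proc_a_transitions(history):
    """Return (may_terminate, must_abort) for Proc-A at the latest moment."""
    error = "ERROR" in history[-1]
    # Nondeterministic transition: [-5,0]Proc-A and not ERROR.
    may_terminate = always(history, "Proc-A", -5, 0) and not error
    # Deterministic transition: ERROR.
    must_abort = error
    return may_terminate, must_abort


running = [{"Proc-A"}] * 6
print(proc_a_transitions(running))                          # (True, False)
print(proc_a_transitions(running + [{"Proc-A", "ERROR"}]))  # (False, True)
```

Both predicates look only at the past, so they can be evaluated at every moment without any commitment about what will happen later.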

The  key  questions  to  be  raised  about  any  formal  notation  in  the  context  of  this  workshop  are:  (1) 
Can  it  deal  with  complexity?  (2)  How  does  it  address  parallelism?  (3)  What  benefits  does  it 
provide?  The  third  question  is  addressed  in  the  next  section.  We  now  turn  to  the  first  two 
questions. 

To deal with complexity, abstraction mechanisms are needed that allow one to express and analyze
relationships  at  higher  levels  without  concern  with  lower-level  details.  Hierarchical  decomposition 
is  one  common  abstraction  mechanism  that  many  methods,  including  HMS  machines,  employ. 
Another  mechanism  in  case  of  HMS  machines  is  the  use  of  multiple  (partial)  states  to  represent  the 
traditional concept of the state of a system. This can potentially reduce the number of states of a
system logarithmically. For example, 2^N states in a finite-state machine may be required to
specify a system described in terms of N states in an HMS machine. A third abstraction
mechanism  for  HMS  machines  is  nondeterminism.  In  traditional  modeling  methods  such  as  Petri 
nets  and  finite-state  machines,  nondeterminism  arises  from  structural  considerations.  For  example, 
if  two  transitions  from  a  state  are  triggered  by  the  same  signal,  then  the  choice  is  made 
nondeterministically. In contrast, in an HMS machine, a transition is explicitly designated to be
deterministic or nondeterministic. Nondeterminism in this form can be used to capture temporal
uncertainty, especially early in the system design stage. It also serves as the key in relating
specification to scheduling as discussed in the next section. Finally, a fourth abstraction concept
for HMS machines is “multi-level specification,” which was first introduced in [GF91]. In a
multi-level specification of a real-time system, a hierarchy of specifications jointly describes the
behavior of a system. The lowest levels normally consist of nondeterministic specifications that
describe a whole class of behaviors. Upper-level specifications employ slightly more complex
versions of HMS machines, called “policy machines,” which describe goal-oriented behavior. The
complete specification is obtained by finding paths or “plans” at the lowest level machines that
satisfy higher-level constraints. As a result, a high degree of reusability and modularity of
specifications is obtained. In an experiment in applying this approach to the specification of a
distributed component of a command and control system [Ga91b], a significant reduction in the
actual specification effort was obtained.

As  far  as  addressing  parallelism  is  concerned,  most  state-based  approaches  do  not  provide  any 
convenient  facilities.  One  must  explicitly  model  the  parallel  processes  individually  and  the  degree 
of parallelism must be known a priori. Preliminary studies have shown that an extended version of
HMS machines may provide a very general approach to dealing with unbounded parallelism. In
the extended version, a state may contain an arbitrary number of “tokens,” each representing one of
a set of similar processes associated with the state. With a slight extension of the logic TIL, one
can then obtain a powerful formalism for specifying and reasoning about parallel processes. In
fact, this approach extends the “tagged-token” model of data flow machines proposed in
[AG82,De86] by introducing conditionality and temporal dependencies in data flow operations.

3  Applications  of  Formal  Methods 

Given  an  executable  formal  specification  of  a  real-time  system,  various  types  of  analysis  can  be 
performed  on  it.  In  addition,  limited  success  has  been  achieved  in  using  a  specification  as  a  basis 
for  synthesis  of  an  actual  implementation.  For  software,  a  common  approach  to  synthesis  has 
been  “correctness-preserving  transformations”  that  lead  a  specification  gradually  to  a  program  in  a 
desired  language,  the  execution  of  which  satisfies  the  requirements  indicated  in  the  specification. 
For  a  hardware  subsystem,  a  specification  may  be  transformed,  after  a  set  of  transformations,  into 
a  hardware  description  language  such  as  VHDL,  from  which  gate-level  logic  diagrams  may  be 
constructed.  In  the  remainder  of  this  section,  we  will  address  the  applications  of  formal  methods 
in  analysis  of  specifications. 

Since  an  actual  implementation  of  a  system  can  be  a  costly  and  time-consuming  process,  an 
executable  specification  can  be  valuable  in  permitting  simulation-based  analysis  to  determine 
whether  performance  related  requirements  can  be  met.  Historically,  the  critical  problem  in 
simulation has been the absence of accurate timing information. For air defense, air traffic control
and  other  application  systems,  where  historical  information  is  available  from  similar  previous 
systems,  the  problem  is  not  too  serious.  For  completely  new  systems  such  as  SDI  and  new 
computer  architectures  or  operating  systems,  the  problem  is  much  more  acute. 

An  important  application  of  a  formal  method  is  in  its  potential  ability  to  verify  formally  properties 
of  a  system  before  it  is  actually  built.  For  hard  real-time  systems,  most  of  the  interesting 
behavioral attributes can be grouped under “safety properties.” A safety property is an invariant of
a system expressible in the language of temporal logic as □p (always p), where p is a “past”
temporal logic predicate. Given an HMS machine specification of a system and such a safety
property, one can always create a new “system failure (SF)” state such that the state SF becomes
true if and only if the safety property is violated. Figure 2 depicts an example, where the safety
property  is  the  deadline  condition  "always  B  within  20  time  units  of  A."  The  vertical  bar  in  the 
figure  represents  a  state  that  always  has  the  value  TRUE  and  the  transition  fires  if  A  was  true  20 
time  units  ago,  while  B  has  been  FALSE  for  the  last  20  time  units. 


Figure  2.  Representation  of  a  Deadline  Safety  Property 
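
The firing condition of the SF transition in this example can be sketched as a run-time monitor. The Python fragment below assumes the same discrete, list-of-sets history used for illustration elsewhere (one set of TRUE states per time unit, oldest first) and treats “the last 20 time units” as inclusive of both end points; the function names are hypothetical.

```python
def system_failure(history, deadline=20):
    """True when the SF transition would fire at the latest moment.

    SF fires if A was TRUE `deadline` units ago and B has been FALSE
    throughout the last `deadline` units.
    """
    now = len(history) - 1
    if now < deadline:
        return False   # the deadline cannot have expired yet
    a_then = "A" in history[now - deadline]
    b_absent = all("B" not in history[k]
                   for k in range(now - deadline, now + 1))
    return a_then and b_absent


def first_violation(trace, deadline=20):
    """Replay a trace and return the first time at which SF fires, or None."""
    for n in range(1, len(trace) + 1):
        if system_failure(trace[:n], deadline):
            return n - 1
    return None


late = [{"A"}] + [set()] * 24                          # B never occurs
on_time = [{"A"}] + [set()] * 14 + [{"B"}] + [set()] * 9
print(first_violation(late))      # 20
print(first_violation(on_time))   # None
```

A monitor of this kind only detects a violation on one observed trace; showing that SF is unreachable on every trace is the verification problem discussed next.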

Assuming  that  the  states  A  and  B  are  part  of  a  larger  specification,  to  verify  the  correctness  of  the 
specification with respect to the safety property, it is sufficient to demonstrate that the state SF is
unreachable. In [GI92], a refutation-based theorem proving approach was presented for verifying
such safety properties which is complete in the following sense: Given a safety property, if it is
indeed satisfied by the specification, there exists a proof for it. The proof begins with the
assumption that the safety property is violated and reasons backwards to demonstrate that the
assumption leads to a contradiction in all possible realizations. The advantage of the method is that
a complete enumeration of behavior is not necessary, in general. Thus, in principle, the proof of a
simple property for even a very complex system could be quite simple. Theorem proving can also
be performed in a forward reasoning form, once again, without the need for complete analysis of
behavior. On the other hand, in an enumeration-based technique, even the verification of a simple
property may require the creation of a complex computation graph. For certain types of properties,
however,  the  enumeration-based  methods  are  more  suitable  than  theorem  proving  methods. 
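
The idea of reasoning backwards from the failure state can be illustrated on a small explicit transition graph. The sketch below is not the refutation-based prover of [GI92]; it is a plain backward graph search, swapped in as a finite-state stand-in. The edge-list representation and the names `backward_reachable` and `sf_unreachable` are assumptions made for this example.

```python
from collections import deque


def backward_reachable(transitions, target):
    """Return every state from which `target` can be reached.

    `transitions` is an iterable of (source, destination) pairs. Only
    predecessors of `target` are visited, so states that cannot lead to
    the target are never explored.
    """
    preds = {}
    for src, dst in transitions:
        preds.setdefault(dst, set()).add(src)
    seen = {target}
    frontier = deque([target])
    while frontier:
        state = frontier.popleft()
        for p in preds.get(state, ()):
            if p not in seen:
                seen.add(p)
                frontier.append(p)
    return seen


def sf_unreachable(transitions, initial_states, sf="SF"):
    """True if no initial state can reach the failure state `sf`."""
    return backward_reachable(transitions, sf).isdisjoint(initial_states)
```

In this toy setting the search touches only the part of the graph that leads to SF, which loosely mirrors the remark that a complete enumeration of behavior is not necessary in general.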

A  specification  can  also  be  used  for  analysis  of  scheduling  requirements  of  processes.  In  [GF91], 
a general method for deriving schedules for concurrent processes from specifications was presented.
In this scheme, one begins from the consideration of a nondeterministic HMS machine
specification of a real-time system. Given a partially-ordered set of processes that must be
executed, one can then derive a set of mathematical inequalities that must be solved in order to
satisfy the local logical and temporal constraints. Such an approach can also be used for
verification  by  demonstrating  the  infeasibility  of  schedules  for  reaching  system  failure  states  of  the 
type  indicated  in  Figure  2. 
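
As a rough illustration of what such inequalities can look like, the sketch below assumes the simplest possible case, which is not the method of [GF91]: each precedence pair (i, j) yields start[j] >= start[i] + duration[i], each deadline yields start[p] + duration[p] <= deadline[p], and processes never compete for a processor. The function names and the dictionary-based inputs are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_starts(durations, precedes):
    """Least start times satisfying start[j] >= start[i] + durations[i]
    for every (i, j) in `precedes`, with all start times >= 0."""
    preds = {p: set() for p in durations}
    for i, j in precedes:
        preds[j].add(i)
    start = {}
    # static_order() yields each process after all of its predecessors.
    for p in TopologicalSorter(preds).static_order():
        start[p] = max((start[i] + durations[i] for i in preds[p]), default=0)
    return start


def meets_deadlines(durations, precedes, deadlines):
    """True if the least solution also satisfies every completion deadline.

    Because the least solution starts every process as early as the
    precedence constraints allow, failing here means the inequality
    system has no solution at all.
    """
    start = earliest_starts(durations, precedes)
    return all(start[p] + durations[p] <= d for p, d in deadlines.items())
```

Used for verification in the sense of the paragraph above, an infeasible system for "reach SF" would show that the failure state cannot be scheduled.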

Finally,  a  formal  specification  may  be  used  (1)  to  develop  requirement-based  test  cases  for 
evaluation  of  an  implementation  and  (2)  to  perform  diagnosis  of  causes  of  errors.  For  testing,  a 
specification of the environment in which a system is to operate will normally be required, while,
for diagnosis, a backward reasoning process much like refutation-based verification can be used.
No significantly new specification features are expected to be necessary in either case, although
analytic  and  heuristic  techniques  are  still  needed  to  realize  these  goals  in  a  realistic  setting. 


4  Discussion  and  Conclusions 

For  three  reasons,  it  is  our  belief  that  the  search  for  the  best  design  methodology  must  be 
abandoned.  First,  the  requirements  of  systems  can  be  very  different.  Design  method  A  may  be  far 
superior to design method B for one application, whereas the opposite may be true for another
application.  Secondly,  the  implementation  framework  (architecture,  operating  system, 
communication  facilities,  and  programming  language  constructs)  may  have  a  significant  bearing  on 
the  quality  of  a  design.  Thus,  for  example,  the  type  of  parallelism,  the  synchronization  constructs 
and  the  reliability  of  the  communication  systems  can  affect  the  performance  of  a  system 
significantly.  One  design  methodology  cannot  be  expected  to  be  satisfactory  in  all  cases.  Thirdly, 
new  methodologies  always  arise  that  improve  upon  existing  approaches.  Thus,  the  justification  for 
choosing  a  methodology  even  for  a  very  restricted  domain  may  be  short-lived. 

In the absence of a clear choice for a design methodology, the judicious approach seems to be to
encourage the development of tools and methods that make possible the rapid evaluation of a
design  under  a  set  of  requirements.  Thus,  formal  methods  for  specification  and  analysis  of 
systems  that  can  deal  with  complexity,  real-time  issues,  performance  and  correctness  are 
recommended  areas  for  further  expenditure  of  resources.  The  search  for  optimality  of  design  must 
also  be  given  up  since  it  is  an  impossible  and  unrealistic  goal  for  large  systems.  The  goal  of  design 
must  be  limited  to  meeting  requirements  under  the  specified  constraints.  Optimality  is  often 
undefinable  and  unattainable. 

As  far  as  validation  of  large  systems  is  concerned,  simulation  and  formal  verification  are  the  most 
promising  approaches.  The  basis  for  both,  however,  should  be  a  specification  methodology  that 
can  lend  itself  to  formal  proof  of  correctness  for  critical  aspects,  can  deal  with  systems  at  various 
levels  of  abstraction,  and  can  avoid  the  abstruseness  that  is  characteristic  of  many  formal  methods. 
The  use  of  graphic  notation  is  recommended  to  deal  with  the  problem  of  accessibility  of  formal 
methods  to  a  wider  audience. 

While  there  exist  many  verification  techniques,  it  has  been  our  experience  that  no  single  method  is 
universally  applicable.  Thus,  a  variety  of  verification  methods  must  be  developed,  some  of  which 
may  be  heuristic  and  incomplete.  For  a  given  system,  the  best  choice  of  method  cannot  always  be 
predicted.  A  strategy  that  has  been  successful  in  a  different  context  has  been  to  attempt  several 
different  techniques  and  experimentally  determine  the  best  approach  for  a  given  problem.  In  our 
experience, theorem proving either in the forward or reverse direction and scheduling-based
analysis have proven to be the most promising formal verification methods. Also, certain
enumeration techniques have shown great promise in hardware verification.

The  use  of  a  formal  representation  of  a  system  can  also  provide  a  mechanism  to  investigate  the 
relationships  between  design  and  scheduling  theories.  As  stated  in  the  previous  section,  it  is 
possible  to  derive  scheduling  requirements  of  aperiodic  and  event-driven  processes  from  a 
specification.  There  exists  little  experience,  however,  in  actually  deriving  scheduling  strategies 
from  specifications  for  large  systems. 

Our  final  conclusions  can  be  summarized  as  follows: 

1. There exists no best design methodology. Different methodologies may be preferable under
different requirements and implementation constraints. Also, rather than seeking an optimal
design, the aim should be to create a design that meets requirements. The cost associated
with seeking an optimal design, even if it can be found, may not be worth the effort.

2.  A  framework  is  necessary  to  evaluate  a  design  under  an  arbitrary  set  of  requirements  and 
constraints. 

3. Formal methods for specification that can deal with complexity and provide capabilities for
simulation, verification, automated support for testing and analysis of temporal properties can
offer  important  benefits  in  the  creation  of  large  and  distributed  real-time  systems. 

4. Experimental analysis is needed to support and validate the results of formal techniques.

5.  The  most  efficient  use  of  resources  will  be  in  the  development  of  integrated  methodologies 
and  tools  for  rapid  evaluation  and  analysis  of  designs  of  complex  real-time  systems.  In 
particular,  formal  methods  can  provide  a  common  language  for  specification  of  requirements, 
simulation, verification, and automated support for testing. Small experimental laboratories
are also necessary to validate the analytic methods. For the experimental work, the definition
of a set of prototypical examples and benchmarks will be a necessary ingredient. Relatively
modest expenditure of resources will be required if this approach is pursued, yet it could
have a major impact on the creation of systems that utilize parallel and distributed
architectures and meet operational requirements in a cost-effective manner.

1 Introduction

The  realization  of  a  real-time  system  can  often  be  an  ad  hoc  process  of  experimentation.  Many 
factors  conspire  to  make  this  the  case,  among  which  are  inflexible  scheduling  paradigms  and 
the  lack  of  high-level  programming  language  support.  Real-time  performance  is  subsequently 
achieved by manually counting instruction-cycle times, hand-optimizing the code, and experimenting
with various orderings of operations to help achieve schedulability.

This  problem  is  compounded  by  the  inherently  iterative  nature  of  system  design.  In  the 
first  step,  mission  goals  are  described  in  a  rather  informal  fashion,  and  typically  in  a  natural 
language.  These  informal  goals  are  clarified  in  a  requirements  specification,  and  subsequently 
expanded  into  a  system  specification.  This  refinement  process  gradually  proceeds  until  a  final, 
complete  implementation  is  constructed. 

This  design  approach  is  replete  with  potential  problems.  For  example,  consider  a  system 
design D and its successor design D'. First, errors may be present in D, and then carried down
to  the  more  concrete  design,  D'.  Second,  errors  may  be  introduced  in  D'  that  were  not  present 
in  D  and  that  may  in  turn  become  present  in  the  final  implementation. 

These problems are especially prevalent in distributed real-time systems, where at each refinement
stage, assumptions must be made about deadlines, scheduling algorithms, CPU speeds,
clock drift, resource requirements, etc. Such design assumptions rarely hold in the full
implementation, and thus, the real-time system will probably not meet its original mission requirements.

In  theory,  this  process  of  stepwise  refinement  should  be  a  natural  one,  and  there  should  be  a 
disciplined  approach  for  detecting  errors  at  each  refinement  stage.  In  practice,  however,  no  such 
disciplined approach exists for designing large, real-time systems. There is often a large schism
between  the  design  and  eventual  implementation  of  the  system.  At  the  design  stage,  implicit 
assumptions  are  made  about  the  eventual  implementation;  for  example,  its  number  of  resources, 
execution  speeds,  etc. 

2  Formal  Design  and  Scheduling 

The  area  of  formal  methods  offers  several  potential  solutions  to  this  problem.  Yet  while  many 
formal models of real-time computation have been developed [1, 3, 7, 4, 10, 8, 13, 17, 19], most
treat processes abstractly, quite isolated from their operating environments. This is where the
unrealistic assumptions are made about the system’s eventual execution model. Such assumptions
range from the overtly optimistic (e.g., all executions are instantaneous), to the impractical
(e.g., a one-to-one assignment of processes to processors) to the bleakly pessimistic (e.g., all
interleavings of process executions are possible). These assumptions rarely hold in practice, and
using them to reason about a real-time system’s temporal properties can lead to incorrect
conclusions.

There has also been considerable progress in developing scheduling algorithms and
analyzers [9, 15, 16, 18, 20]. In these approaches, the underlying computational model is generally
limited to simple precedence relations between processes where, for the most part, the
effect  of  process  synchronization  is  ignored.  Since  complex  interactions  between  processes  are 
not  captured,  these  approaches  cannot  be  used  for  proofs  of  desirable  properties  other  than 
schedulability. 

We  believe  it  is  essential  to  unify  the  area  of  formal  methods  and  scheduling  theory.  We  have 
made a start in this direction with our development of the CSR specification paradigm [5, 6]. The
computation  model  of  CSR  is  resource-based,  in  that  multiple  resources  execute  synchronously, 
while  processes  assigned  to  the  same  resource  are  interleaved.  Resource  contention  is  resolved 
by  a  process’  current  priority.  Using  this  model,  we  can  prove  basic  properties  of  the  system 
design  using  a  proof  system,  or  alternatively,  a  reachability  analyzer. 
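
To make the resource-based execution model concrete, here is a toy discrete-time simulation in Python. It is only a loose illustration and does not reproduce CSR syntax or semantics: priorities are assumed static (the text speaks of a process' current priority), ties go to the process listed first, every process is ready at tick 0, and the `Process` record and `simulate` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    resource: str   # the resource (e.g. a CPU) this process is assigned to
    priority: int   # larger number wins contention
    work: int       # execution time needed, in ticks


def simulate(processes, ticks):
    """Return, for each tick, a mapping from resource to the process it runs.

    All resources advance in lockstep (one tick each), while processes
    sharing a resource are interleaved according to priority.
    """
    remaining = {p.name: p.work for p in processes}
    timeline = []
    for _ in range(ticks):
        running = {}
        for p in processes:
            if remaining[p.name] == 0:
                continue  # finished processes no longer compete
            current = running.get(p.resource)
            if current is None or p.priority > current.priority:
                running[p.resource] = p
        for p in running.values():
            remaining[p.name] -= 1
        timeline.append({res: p.name for res, p in running.items()})
    return timeline
```

With two processes on `cpu0` and one on `cpu1`, the higher-priority process on `cpu0` runs to completion before the other gets the resource, while `cpu1` proceeds in parallel.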

However, the CSR framework is rather crude, in that it possesses a discrete-time model, and
only considers the CPU as a resource. Thus it becomes cumbersome to reason about large-scale,
heterogeneous, distributed systems. Certainly more effort is required to investigate the
potential  links  between  formal  design  and  schedulability  analysis.  The  high  cost  associated  with 
real-time  failures  mandates  that  various  design  alternatives  can  be  specified  and  analyzed  before 
implementation. 

3 The Problem of Programming Languages


Programming language support is needed to help refine a design into an implementation. Without
high-level real-time languages, programmers are frequently forced to use assembly language
modules  for  some  of  the  key  components  of  their  systems.  Recently,  experimental  languages 
have  been  proposed  which  provide  first-class,  real-time  constructs  [11,  12,  14].  An  example  of 
such  a  construct  is  “within  10ms  do  B,”  where  the  block  of  code  “B”  must  be  executed  within 
10  milliseconds.  This  constraint  is,  in  turn,  conveyed  to  the  real-time  scheduler  as  a  directive. 

These  languages,  while  providing  a  convenient  framework  for  expressing  time  in  programs, 
have  done  little  to  ease  the  process  of  translating  a  real-time  specification  into  schedulable 
code.  Thus,  their  timing  constructs  have  not  been  adopted  in  any  production-level  programming 
languages. 

We  believe  the  reason  is  straightforward:  Language  constructs  such  as  “within  10ms  do  B” 
establish  constraints  on  blocks  of  code.  However,  “true”  real-time  properties  establish  constraints 
between the occurrences of events [2, 10]. These constraints typically arise from a requirements
specification,  or  from  a  detailed  analysis  of  the  application  environment.  While  language-based 
constraints  are  very  sensitive  to  a  program’s  execution  time,  specification-based  constraints 
must  be  maintained  regardless  of  the  platform’s  CPU  characteristics,  memory  cycle  times,  bus 
arbitration  delays,  etc. 

We  have  recently  taken  a  new  approach  to  this  problem,  and  our  objective  is  to  “bridge  the 
gap”  between  specification  languages  and  programming  languages.  Our  approach  is  to  treat 
a  real-time  program  as  (1)  an  event-based,  timing  specification,  which  represents  the  system’s 
real-time  requirements;  and  (2)  a  functional  implementation,  that  is,  the  system’s  code.  Instead 
of  constraining  blocks  of  code,  timing  constructs  establish  constraints  between  the  observable 
events  within  the  code.  As  an  example,  consider  the  following  specification  fragment,  which  is 
rendered  pictorially  in  Figure  1: 

(1) The motion-sensor emits obj_coords on port p.

(2) Transformation function F converts obj_coords into next_cmd for controller.

(3) The controller receives next_cmd on port q.

(4) To achieve steady state, transmission of next_cmd is made no earlier than 3.5 ms after
receipt of obj_coords.

(5) To guarantee response-time threshold, transmission of next_cmd is made no later than 4.0
ms after receipt of obj_coords.

Figure 1: Event-Based Specification of Sensor-Controller System
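
Constraints (4) and (5) relate two observable events only, so they can be checked against a log of event timestamps without looking at the code in between. The following Python sketch assumes timestamps in milliseconds and a trace given as (receive, send) pairs; the names `delay_ok` and `trace_ok` are hypothetical.

```python
EARLIEST_MS = 3.5  # constraint (4): send no earlier than this after receive
LATEST_MS = 4.0    # constraint (5): send no later than this after receive


def delay_ok(receive_ms: float, send_ms: float) -> bool:
    """True if one receive/send pair satisfies constraints (4) and (5)."""
    delay = send_ms - receive_ms
    return EARLIEST_MS <= delay <= LATEST_MS


def trace_ok(pairs) -> bool:
    """True if every (receive_ms, send_ms) pair in the trace is in the window."""
    return all(delay_ok(r, s) for r, s in pairs)
```

Nothing in the check mentions F or its execution time, which is the sense in which the specification is independent of the platform.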

We  claim  that  the  following  program  fragments  should  realize  the  specification: 

/* Program A */
do
    receive(p, obj_coords);
start after 3.5 ms finish within 4.0 ms
{
    next_cmd = F(obj_coords);
    send(q, next_cmd);
}

/* Program B */
do
{
    receive(p, obj_coords);
    next_cmd = F(obj_coords);
}
start after 3.5 ms finish within 4.0 ms
    send(q, next_cmd);

The  “send”  and  “receive”  operations  are  the  system’s  only  observable  events.  The  “do” 
statement  establishes  timing  constraints  only  between  these  two  operations.  On  the  other  hand, 
the local statement “next_cmd = F(obj_coords)” is only constrained by the program’s natural
control  and  data  dependencies. 

Armed  with  this  interpretation,  we  consider  both  programs  as  having  equivalent  semantics! 
This  is  quite  different  from  the  approaches  mentioned  above,  where  timing  constructs  establish 
constraints  on  code.  In  that  interpretation,  program  A  would  first  receive  its  data,  then  delay 
for  3.5  ms  and  finally,  evaluate  F  and  send  the  result  within  the  remaining  0.5  ms.  Program  B 
would receive its data, evaluate F, then delay for 3.5 ms and finally, send the result within 4.0
ms of evaluating F!

Both  programs  may  fail  to  implement  the  specification  under  the  code-based  constraints.  If 
F  is  a  CPU-intensive  function  (and  thereby  requires  over  0.5  ms  of  execution  time),  program 
A  is  inherently  unschedulable.  On  the  other  hand,  program  B  establishes  a  constraint  between 
the  evaluation  of  F  and  the  send  operation,  and  not  between  the  two  specified  events.  Both 
programs  would  have  to  be  rewritten  to  achieve  the  desired  effect.  The  necessary  corrections 
would  include  manually  decomposing  F,  as  well  as  adjusting  the  timing  constraints.  The  actual 
changes  would  heavily  depend  on  the  particular  characteristics  of  the  computer,  and  thus,  the 
very  reason  for  using  high-level  timing  constructs  would  be  defeated. 
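
The difference between the two interpretations comes down to how much time is left for F. The sketch below works through that arithmetic for program A under simplifying assumptions that are not in the original: the cost of `send` and `receive` is ignored, no other process competes for the CPU, and the function names are hypothetical.

```python
START_AFTER_MS = 3.5
FINISH_WITHIN_MS = 4.0


def fits_code_based_program_a(f_exec_ms: float) -> bool:
    """Code-based reading of program A: wait 3.5 ms, then evaluate F and
    send inside the remaining 4.0 - 3.5 = 0.5 ms."""
    return f_exec_ms <= FINISH_WITHIN_MS - START_AFTER_MS


def fits_event_based(f_exec_ms: float) -> bool:
    """Event-based reading: only the receive-to-send interval is constrained,
    so F may run during the delay and has up to 4.0 ms."""
    return f_exec_ms <= FINISH_WITHIN_MS
```

For an F that needs 2.0 ms, the code-based reading of program A is unschedulable while the event-based reading still fits.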

There are several immediate benefits to this semantics for real-time constructs. First, a
source  program  is  not  hardware-specific,  and  thus  maintains  the  abstract,  “portable”  spirit  of  a 
high-level  language.  Since  the  timing  constraints  refer  only  to  specification-based  events,  they 
need  not  be  hand-tuned  for  an  individual  CPU.  Second,  this  decoupling  of  timing  constraints 
from  code  blocks  enables  a  more  straightforward  implementation  of  an  event-based  specification. 

Also,  much  of  the  arduous,  assembly-language  level  hand-tuning  can  now  be  accomplished 
automatically  —  by  compiler  optimization  techniques.  Many  of  these  are  code-motion  methods 
similar  to  those  used  in  instruction  scheduling.  Here,  however,  the  objective  is  to  achieve 
consistency  between  the  real-time  constraints  and  the  execution  characteristics  of  the  code. 
In  doing  this  we  use  the  observable  events  as  “signposts,”  which  constrain  the  places  where 
code may be moved. For example, the local operation “next_cmd = F(obj_coords)” can be
performed  during  the  delay  between  the  two  observable  events. 

Of  course,  the  greatest  challenge  lies  in  the  optimization  of  concurrent  programs,  since  it 
requires  inter-process  control-flow  analysis.  In  the  end,  this  problem  can  be  addressed  only  by 
close  interaction  between  the  compiler  and  the  real-time  scheduler.  Again,  this  requires  a  tight 
relationship  between  the  development  environment  and  a  scheduling  tool. 

4  Where  Should  Resources  Go? 

We see both long and short term goals in unifying real-time design and implementation. Some
very basic technology is required in the short term, such as on-line debuggers and profilers, as
well as static timing analyzers. For example, all real-time scheduling theory assumes that real-time
response can be reasonably bounded, and schedulability analysis is then carried out using
these bounds. Yet no reliable tools exist which can generate these bounds; this is especially true
with modern, complex computer architectures.
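
To show how directly a schedulability verdict depends on these bounds, the sketch below applies one well-known analysis, the fixed-priority response-time recurrence R = C_i + sum over higher-priority tasks j of ceil(R / T_j) * C_j. The choice of this particular analysis is ours and is not named in the text; tasks are assumed periodic, independent, listed from highest to lowest priority, with deadlines equal to periods and integer time units.

```python
def response_time(tasks, i):
    """Worst-case response time of tasks[i], or None if it misses its period.

    `tasks` is a list of (wcet, period) pairs ordered from highest to
    lowest priority. The recurrence is iterated until it reaches a fixed point.
    """
    wcet_i, period_i = tasks[i]
    r = wcet_i
    while True:
        # -(-r // t) is integer ceiling division.
        interference = sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        nxt = wcet_i + interference
        if nxt > period_i:
            return None  # deadline (taken equal to the period) is missed
        if nxt == r:
            return r
        r = nxt


def schedulable(tasks):
    """True if every task meets its deadline under the recurrence above."""
    return all(response_time(tasks, i) is not None for i in range(len(tasks)))
```

Raising the execution-time bound of the lowest-priority task from 3 to 6 units in the small task set used in the tests flips the verdict, which is why unreliable bounds undermine the analysis.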

In  the  long  term,  we  believe  that  formal  design  methods  will  pay  very  large  dividends.  This 
is particularly true for those efforts aligned with implementation efforts. The “gaps” between
design  and  implementation  occur  when  software  engineers  get  too  far  removed  from  system 
implementers.  The  most  successful  projects  will  span  all  levels  of  systems  development  -  through 
design,  integration,  testing,  operation  and  maintenance. 

The workshop on large, distributed, parallel architecture, real-time systems has identified
four  major  issues  regarding  the  development  of  these  systems.  My  position  is  that  a  fifth  issue 
underlies  these  four  and  must  be  addressed  as  part  of  any  solution  for  the  four.  This  issue  is  that  a 
breakthrough  in  standardization  and  quantification  of  computer  system  services  critical  to  this 
class  of  real-time  systems  will  have  to  be  made  and  incorporated  into  the  next  generation  of 
methodologies  and  real-time  scheduling  theories.  The  scheduling  function  and  its  interface(s)  in 
a distributed, parallel, real-time setting is one example; the system support for expressing abstract
notions of timeliness and system response above and beyond the conventional concepts of priority
and event is another. Other real-time functions are critical and, no doubt, need to be discussed in
more depth in the workshop. The workshop has identified design methodologists and real-time
scheduling theorists as the two key players in advancing the state of the practice. Because they
treat themselves as separate disciplines, these two groups have worked independently for the most
part. Each real-time design methodology and real-time scheduling approach has its adherents
and critics. Instead of adding to the potential debate between any of these, I submit a corollary to
my position: that all current efforts in developing deployable methodologies and scheduling
approaches for this class of real-time problem are systemically limited. A third key player needs
to be added for design methodology and real-time scheduling theory to have a substantive impact
on the industrial base: the computer technologist.

When  we,  as  users  and  developers  of  real-time  systems,  consider  the  concept  of  a  large, 
distributed, parallel architecture, real-time system, we are confronted with a situation analogous
to that which confronted computer system users and designers of 15 to 20 years ago. At that time,
the use of multiple mini-computers was becoming an alternative to the use of centralized mainframe
computers. Some of the problems then were how to break the work into meaningful pieces
to  match  the  limited  performance  of  a  mini-computer  and  how  to  orchestrate  the  processing 
globally. 

There  was  much  debate  and  question  as  to  whether  such  systems  were  a  practical 
alternative  to  centralized  mainframe  solutions.  For  a  period  of  time  the  only  workable  alternatives 
were  either  to  use  one  vendor's  proprietary  solution  solely  or  to  custom  build  one's  own.  Over  the 
long run these alternatives were not effective because of the lack of flexibility and high cost.
Today the situation is quite similar: the practical way to design and build (real-time) systems is
centralized and to stay close to one vendor's product. However, metaphorically speaking, just
because your only tool is a hammer does not mean every problem is a nail.

The obstacle facing pioneers in those days who were looking for flexible and cost effective
solutions was the lack of infrastructure: of standardized network protocols, open systems
software architectures, and modularized hardware interfaces. Regardless of how elaborate their
efforts were in creating design methodologies and computer algorithms for distributed systems,
the results were systemically limited. At some point, the development of new standardized (and
modularized) interfaces was essential; advanced methodology or algorithm development had
reached a point of diminishing returns. For example, in the mid to late 1970s, one key rationale
for developing single user workstations as an alternative to time-shared mainframes was to avoid
the  issues  of  large,  complicated  and  resource  intensive  operating  systems  found  on  the  latter.  In 
hindsight,  a  good  deal  of  operating  systems  research  and  development  (along  with  design 
methodologies)  of  that  era  became  as  baroque  as  they  did  because  they  were  trying  to  build 
solutions  given  near  impossible  conditions. 

We  face  this  situation  today.  What  can  be  done  about  this  systemic  condition?  This 
question  is  equivalent  to  the  fourth  workshop  issue  which  is  to  identify  the  most  promising 
technical areas that if funded would improve our capability to design and build such systems. My
answer is that there is an opportunity in this workshop for design methodology and real-time
scheduling researchers to look hard at the shortcomings in current real-time computing engines
that limit the effectiveness of any methodology and any scheduling theoretic approach.

For  example,  our  experiments  and  performance  benchmarks  at  NASA  Ames  Research 
Center corroborate the findings of Norman Howes as discussed in his paper titled “Toward a
Real-Time Ada Design Methodology.” Many methodologies, as well as the one explored by Howes, are
advanced on the basis of desirable design principles. We have found, as did Howes, that these
methods  when  tried  on  uniprocessors  (and  on  multiprocessors)  can  lead  to  elaborate,  highly 
complex  designs  that  are  fundamentally  flawed,  inefficient  or  unpredictable  in  performance.  We 
conclude  that  little  if  any  attention  is  given  to  quantifying  the  performance  of  the  results  of  these 
methodologies.  The  burden  does  not  all  rest  on  the  developers  of  these  methodologies;  the 
computer  was  a  significant  accomplice.  For  example,  synchronization  primitives  failed  in  special 
cases; event timer functions performed erratically; in general, performance of real-time functions
often  was  affected  by  the  load  and  ordering  of  events.  We  found  this  to  be  the  case  across  three 
distinct  computing  platforms/Ada  software  environments. 

Because the technology base for real-time systems is so brittle and quirky there is little
incentive for anyone to attempt to develop a methodology that embraces the level of detail
necessary to be of substantive use in a real-time project. Any substantive attempt is tantamount to
the development of a project-specific tool and, as a consequence, little if any long term leverage
is gained. This leads me to the first workshop issue: what methodology would I choose if I had
the  responsibility  of  developing  a  large,  distributed,  parallel  architecture  real-time  system? 

Faced with no alternative of re-scoping the big four requirements (large, distributed,
parallel architecture and real-time), I would be justified to expend engineering resources to
develop a project specific methodology. The methodology would be based on repetitive use of
'cut and try' that was the staple of design approaches used by electronic engineers in the sixties and
seventies. Today, CAD/CAM tools obviate the reliance on 'cut and try' because the physical
properties (and performance) are thoroughly understood and hence predictable (for a number of
silicon  processes  and  for  board  level  design).  We  simply  do  not  have  a  fundamental 
characterization  of  real-time  computing  engine  properties  today  that  we  can  so  thoroughly  rely 
upon.  As  a  consequence,  the  first  order  of  business  for  the  project's  methodology  is  to  establish 
a  pedigree  of  the  real-time  services  of  the  computing  engines  that  the  project  can  afford.  The 
second  is  to  find  real-time  scheduling  approaches  that  the  computing  engines  are  capable  of 
supporting  efficiently  while  meeting  system  response  requirements.  At  this  point  I  would  expend 
engineering  resources  to  investigate  the  augmentation  of  the  project's  methodology  with  desirable 
design  principles.  The  goal  would  be  to  seek  a  means  of  unifying  or  simplifying  the  overall  design 
or,  at  least,  some  intermediate  levels  of  the  design. 

The 'cut and try' approach is one alternative to the third workshop issue: What is the best
method for validating that these real-time systems behave as specified? Whether it is the best is a moot
point; what the best must have is the element of repetitive use of 'cut and try' where the system,
as it is being built up, is incrementally tested against realistic test cases.

One  workshop  issue  remains:  What  should  be  the  relationship  between  real-time  Design 
Theory  and  real-time  Scheduling  Theory  in  a  design  methodology  for  this  class  of  systems?  To 
me  this  is  the  most  problematic  issue  of  all;  its  resolution  is  the  key  to  successfully  building  a  new 
class  of  real-time  systems,  and  yet  our  current  attempts  to  address  it  are  systemically  limited  by 
the  current  state  of  real-time  services  and  interfaces.  This  point  was  brought  up  earlier  and  is  my 
position  for  this  workshop. 

In closing, let us consider what should be the relationship if, for example, there was a real-time
computing engine that was capable of supporting a broad variety of real-time scheduling
algorithms  equally.  By  this  I  mean  that  the  differences  in  observed  performance  of  different 
algorithms  would  not  be  dependent  on  the  implementation  but  rather  on  the  computational 
characteristics  and  complexity  of  the  algorithm.  I  believe  that  this  would  increase  substantially  the 
collaboration  between  Design  Theory  and  Scheduling  Theory  researchers. 

1. DESIGN METHODS

During the past three years, we have been considering the problem of what is the best
way to design real-time systems, especially for architectures that have more than one
processor. We realize that there is probably no one best way. Indeed, there have been a number
of proposals by various authors in the recent literature on the subject of real-time systems
design. At IDA, we are often in the position of giving advice to our sponsors on methods
and techniques to use on real-world projects. Our interest in this area was motivated by our
desire to be able to give sound advice in these situations. We originally did not set out to
develop a new methodology, but rather to determine which among the many proposed
methods were the most promising.

At the time our investigation started, two methods that were receiving a good deal of
attention were the method of Nielsen and Shumate documented in their textbook [9] and
the DARTS method of Gomaa [6]. While we have read about many methods, we have only
had time to analyze these two in depth in our laboratory. Our experience, based on reading
about other experiments (e.g., [7]), and on implementing both Nielsen and Shumate’s and
Gomaa’s examples on sequential and parallel architecture machines and comparing them
with implementations of the same examples designed using other methods, leads us to
believe that the technique used for process (task) structuring, i.e., what your model of
concurrency is based on, is one of the key issues in the design phase.


Both the Nielsen and Shumate method and the DARTS method have been used to
develop real-world systems and both methods are evolving (e.g., [10], [11] and [2]). Also,
both methods encompass a good deal more than just process structuring. Both of these
methods belong to the class of methods that are extensions of the concepts of structured
design to the field of concurrent real-time systems. Another class of real-time design methods
that is emerging is the class of methodologies based on object modeling (e.g., [1] and [11]).
Based on what we have learned so far, it seems that both of these classes of design methods
can lead to designs that are unnecessarily complex. The reason for this is probably easier
to see in the case of object oriented design techniques. The temptation here is to give each
object its own thread of control (process or task). If any of these objects must interact
frequently with other objects, it can readily be appreciated that there can be significant
overhead in task-to-task communication, synchronization, or both.

So far, none of the methods we have looked at in detail has a good way of ensuring
that objects that must interact frequently are (usually) in the same thread of control. In
fact, many of them have process structuring techniques that seem to encourage placing
strongly interacting objects in different threads of control. This not only leads to
inefficiency, it leads to designs that are overly complex, and this has ramifications in the area of
integration testing, validation and maintenance of the system. It seems to us that what is most
needed in these methodologies is a method for ensuring that strongly interacting objects are
assigned to the same thread of control whenever possible.
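
As an illustration only, the idea of keeping strongly interacting objects in one thread of
control can be sketched as a grouping step over measured or estimated interaction rates. The
sketch below is not a method proposed in this paper: the function name, the object names, the
rates, and the threshold are all hypothetical, and the grouping uses a standard union-find
structure chosen for brevity.

```python
from collections import defaultdict


def group_into_threads(objects, interactions, threshold):
    """Group objects so that any pair interacting at or above `threshold`
    lands in the same group (one group = one candidate thread of control)."""
    parent = {obj: obj for obj in objects}

    def find(obj):
        # Follow parent links to the group representative, compressing the path.
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    for (a, b), rate in interactions.items():
        if rate >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(list)
    for obj in objects:
        groups[find(obj)].append(obj)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical objects and interaction rates (interactions per second).
objects = ["sensor", "filter", "tracker", "display", "logger"]
interactions = {
    ("sensor", "filter"): 1000,
    ("filter", "tracker"): 800,
    ("tracker", "display"): 5,
    ("display", "logger"): 1,
}
threads = group_into_threads(objects, interactions, threshold=100)
```

With these assumed rates, the sensor, filter, and tracker objects share one thread of control,
while the weakly coupled display and logger objects each keep their own. Choosing the
threshold remains a design judgment that this sketch does not address.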

In our lab work, we experimented with basing our concurrency model on the
real-world processes occurring in the problem space in which our system was to
operate. The results were improved throughput, fewer independent
threads of control and smaller code size, while maintaining or improving timing
behavior. Our designs based on this “process modeling” approach also proved to be much
more portable than the other examples. All of our experiments were done in the Ada
programming language, and we realize that some of the improvement we were seeing may be
attributed to compensating for certain inefficiencies in the current version of the Ada
language. It is reasonable to believe that in the future, when some of these language
inefficiencies have been corrected (e.g., with Ada 9X), the comparison might not be as dramatic.
Nonetheless, we believe that the benefits of process modeling will still be very noticeable
due to the general simplification of design that this method provides with respect to the other
methodologies we have studied.

2. RELATION BETWEEN DESIGN THEORY AND SCHEDULING THEORY


It is our belief that current real-time scheduling theory is not closely related to most of
the real-time design theories that have been advanced recently. Real-time scheduling
theory is based on the time-line model, which models the time frame in which the real-time
problem is to be solved. Until recently, the time-line model has been so widely used by real-time
practitioners that most of them have identified this model with reality. The time-line model
is useful for reasoning about how multiple tasks can be scheduled on a single processor and
about whether these tasks will meet their deadlines. Because of the importance of
predictability in real-time systems, these considerations have been regarded as so critical that they
have dominated real-time design to the point where real-time design is conditioned by
schedulability analysis.

It is common practice in real-time design today to divide the time-line up into time slots
for the various concurrent tasks comprising the real-time system and to assign time
budgets to each slot. These time slots are then scheduled, using the time budgets as the task
execution times, either by some static scheduling algorithm or by manually fitting the slots into
a cyclic executive. The tasks are then designed to try to meet these time budgets. If that
proves to be impossible, then an attempt is made to borrow additional time from the
budgets of other tasks that do not need as large a budget as originally assumed. The problem
with the time-line model is that it is not useful for reasoning about some of the fundamental
problems associated with the design of real-time systems, such as what functions of the
system should be grouped into separate threads of control. It is not a good abstraction for
reasoning about how these individual processes or tasks should be designed, and it does not
generalize well to multiple-processor systems. Furthermore, this approach introduces
significant overhead in order to ensure predictable timing behavior by forcing the design to
comply with an unnatural model.
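
The budgeting practice described above can be made concrete with a small sketch. It assumes
a cyclic executive with fixed-length frames, each holding a list of task slots; the task
names, frame length, and budgets are hypothetical values chosen only to show the
bookkeeping, not figures from any real system.

```python
def frame_slack(frame_ms, frames, budgets):
    """Return the unused time in each frame; a negative value means the
    slots assigned to that frame no longer fit."""
    return [frame_ms - sum(budgets[task] for task in frame) for frame in frames]


def borrow(budgets, needy, donor, amount_ms):
    """Return new budgets with `amount_ms` moved from `donor` to `needy`."""
    if amount_ms > budgets[donor]:
        raise ValueError("donor budget too small")
    updated = dict(budgets)
    updated[donor] -= amount_ms
    updated[needy] += amount_ms
    return updated


FRAME_MS = 10                                   # assumed frame length
frames = [["nav", "radar"], ["nav", "display"]]  # assumed slot layout
budgets = {"nav": 5, "radar": 5, "display": 5}   # initial time budgets (ms)

slack_before = frame_slack(FRAME_MS, frames, budgets)

# Suppose "display" cannot be built within 5 ms while "nav" needs only 3 ms.
rebudgeted = borrow(budgets, needy="display", donor="nav", amount_ms=2)
slack_after = frame_slack(FRAME_MS, frames, rebudgeted)
```

Note what the sketch leaves out: nothing in this bookkeeping says which functions belong in
which task, which is the limitation of the time-line model that the paragraph above points to.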

Instead of just modeling the time frame in which a real-time problem is to be solved (as
with the time-line model), it would be useful to be able to model the problem space in which
the problem is to be solved. The benefits from object oriented design in the non-real-time
problem domain suggest that modeling the problem space yields information that is
relevant to how a system should be designed. In the past few years, new design methods for
real-time systems have been emerging whose underlying models are either a generalization
of the time-line model or a replacement for it. The earliest departure from the time-line
model seems to have been in the area of function-driven scheduling, e.g., [8] and [13]. Here,
a time value function or an importance function is used to blur or spread the concept of a
deadline. Next came the class of methodologies that attempt to generalize the
techniques of structured analysis and structured design to the real-time domain, e.g., [6],
[14] and [9]. Thereafter came the class of methodologies that attempt to extend the concepts
of object oriented design to the real-time domain. One recent real-time conference featured
over a half dozen papers proposing various “object oriented real-time methodologies”.
Concurrent with the emergence of the object oriented real-time methodologies was the
emergence of a small number of papers, e.g., [3], [12], [7] and [5], that suggest using a
model similar to, but somewhat different from, the object model; we will refer to this as
process modeling.

We believe all of these generalizations and alternatives to the time-line model supply
additional information that may be helpful in the design of real-time systems, but that
process modeling offers the greatest potential. Further, it is our position that real-time design
theory and real-time scheduling theory should have a closer relationship, and that this
should be accomplished by a rethinking of real-time scheduling theory so that eventually it
will support these new design methods. At the present time, real-time scheduling theory is
usually based on a number of simplifying assumptions that in effect assume that the
systems to which the theory is to apply have already been designed. For instance, it makes
the following kinds of assumptions: (1) the (perhaps worst case) execution times of all tasks
are known, (2) most of the tasks are periodic, with the occasional need to handle an
asynchronous request for service, (3) the deadlines for all tasks are known in advance. As a
result, current scheduling theory does not support current design theory very well, and in
order to use it, one has to allow the design to be constrained by the time-line model.
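
To show how much such a theory needs to know before it can say anything, the sketch below
applies one well-known result that is not named in this paper: the Liu and Layland
utilization bound for rate-monotonic scheduling of independent periodic tasks on a single
processor, with each deadline equal to the task's period. The task sets are hypothetical.
The test is sufficient but not necessary, so a task set that fails it may still be
schedulable.

```python
def utilization(tasks):
    """Total processor utilization for (worst_case_exec_time, period) pairs."""
    return sum(exec_time / period for exec_time, period in tasks)


def liu_layland_bound(n):
    """Utilization bound n * (2**(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


def passes_rm_bound(tasks):
    """True if the task set is guaranteed schedulable by the bound.
    Requires every execution time and period to be known in advance."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical task sets: (worst-case execution time, period) in ms.
light_load = [(1, 4), (1, 5), (2, 10)]
heavy_load = [(2, 4), (2, 5), (2, 10)]

light_ok = passes_rm_bound(light_load)
heavy_ok = passes_rm_bound(heavy_load)
```

Every input to `passes_rm_bound` is a product of design decisions (which tasks exist, how
long they run, how often), which is the sense in which the theory presumes a finished design.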

More and more, these new design methods are being attempted with the knowledge that
whether the system will behave predictably and meet its timing requirements
will have to be ascertained by testing the finished design, either with prototypes or by
simulation. While this is a workable approach from a practical point of view, it is somewhat
undesirable in that there is no underlying (theoretical or mathematical) reasoning about why
the system behaves (with respect to timing behavior) as it does. It is therefore important that
scheduling techniques be developed that support design methods, rather than techniques that
force designs into unnatural models or that are employed as forcing functions to make systems
designed via other models try to meet the desired timing requirements.

In the dissertation [13], Strayer essentially argues for real-time systems without
deadlines that do the right thing at each instant of time rather than attempt to meet deadlines.
He argues that we can be assured that they do the right thing at each instant of time
because they are function-driven designs and his importance functions ensure that the system
is doing the most important thing at any instant of time. While we are not entirely
convinced by Strayer’s arguments, we believe that the approach he suggests would be superior
to trying to meet deadlines, if it can be accomplished via importance functions. One
advantage of a system that was designed to do the right thing at any given instant would be that
it would therefore always behave in the best possible way during transient overload.
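
A minimal sketch of the function-driven idea follows. It does not reproduce Strayer's
importance functions, whose actual form is given in [13]; the two function shapes, the task
names, and all numeric values are assumptions made purely for illustration. Each task carries
a function of time, and at any instant the task with the highest current value is chosen.

```python
def step_importance(deadline, value):
    """Hard-deadline special case: full value up to the deadline, none after."""
    return lambda now: value if now <= deadline else 0.0


def decaying_importance(deadline, value, grace):
    """Blurred deadline: value falls linearly to zero over `grace` time
    units after the nominal deadline."""
    def importance(now):
        if now <= deadline:
            return value
        if now >= deadline + grace:
            return 0.0
        return value * (deadline + grace - now) / grace
    return importance


def most_important(tasks, now):
    """Return the name of the task whose importance is highest at `now`."""
    return max(tasks, key=lambda name: tasks[name](now))


# Hypothetical tasks and parameters.
tasks = {
    "track_update": decaying_importance(deadline=10, value=8.0, grace=10),
    "status_report": step_importance(deadline=20, value=5.0),
}
choice_early = most_important(tasks, now=5)
choice_late = most_important(tasks, now=16)
```

With these assumed shapes, the track update is chosen early, but once it is well past its
nominal deadline its value has decayed below that of the status report, so the choice
changes. This is one simple way a system could keep doing something sensible during overload
rather than only recording a missed deadline.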

Strayer’s concepts are yet to be proven in the real world of system implementation and
testing. However, we believe that an approach along these lines may offer real promise.
Several real-time systems developers that we have talked to never even try to apply current
scheduling theory because they believe it to be too restrictive or too unrealistic to merit
consideration. Consequently, systems are designed, prototyped, tested, redesigned, retested,
etc. until a workable solution emerges. The problem with this approach is that it is not based
on well understood (perhaps proven) design principles that are supported by scheduling
theory. Then, when requirements change somewhat or when the system is to be redesigned
to meet similar but different requirements, the whole process has to be started over. One
process control company design department we talked to stated that they tried to mitigate
this redesign problem by starting with a similar system if possible, modifying it so that it
might meet the new requirements and then testing it. By starting at this point they were
often able to save some time with respect to the alternative of starting from scratch.

3.  VALIDATING  DISTRIBUTED  OR  PARALLEL  DESIGNS 

We do not have much of a position on the validation of such systems, primarily because
general experience with validating systems of this class seems to be lacking and our own
experience is limited. However, based on our experience measuring the performance of
small but representative real-time systems on both single and multiprocessor machines, and
redesigning them for better performance, we believe that the detailed testing of parallel
real-time systems offers a wealth of insight into the (often unexpected) behavior of these
systems. Consequently, we believe that testing will currently have to play the primary role
in verification of systems of this class.

While we are not very knowledgeable in the area of formal methods and proving of
programs, we think that this area has an important role to play in the future of this class of
systems. However, since our experience, and much of the experience of other researchers, is
with languages that do not have provable semantics, we cannot conceive of how parallel
real-time systems could be validated without extensive testing at the present time. On the
other hand, attempts at the formal specification of such systems still seem desirable
because a formal specification provides an unambiguous statement of what the system should
do, which can provide insight into how tests might be constructed to determine if the system
behaves as specified or not. Determining how to construct a test to determine if a parallel
real-time system meets a specific requirement is often highly non-trivial. Techniques employed
for testing of sequential or concurrent systems often do not apply when the system has parallel
threads of control. Our experience, for example, with debuggers on parallel machines is that
they often cannot trace all the threads of control that are executing simultaneously in a
meaningful way. With sequential debuggers the program can be stopped and restarted
without affecting the logic of the program. But with a parallel program, what does it mean to
stop one of the threads of control while the others continue on? Or what does it mean to
stop all threads of control simultaneously when, in practice, this cannot be achieved?

Validation of parallel real-time code might be undertaken with some sort of
“non-intrusive” monitor such as the product parasight offered by Encore for their parallel
machines. We have not been able to experiment with this product because our real-time test
beds are written in Ada and parasight currently does not work with the Ada language.
Basically, the idea here is that one of the multiple processors is used to gather information
in a non-intrusive fashion (that does not affect timing) about the behavior of the code being
executed on the other processors.

For our purposes, we have found that code instrumentation works well for answering
many of the questions related to behavior of a running system. Our experience seems to
indicate that when code instrumentation can be used, it should be designed into the system
from the beginning with a view toward always having it there, because its removal alters
the timing, so there are no guarantees that the non-instrumented code will behave exactly as
was observed in the laboratory with the instrumented code. Our experience indicates
that a great deal of information can be learned about the execution behavior of a program
with only a small overhead.
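
The kind of permanent, low-overhead instrumentation described above might look like the
following sketch. It is not the instrumentation used in our Ada test beds; the class name,
event labels, and buffer size are hypothetical, and Python is used only for brevity. Events
are time-stamped into a fixed-size buffer so that the cost per event is small and roughly
constant, and the analysis is done afterwards.

```python
import time
from collections import deque


class EventLog:
    """Fixed-capacity log of (timestamp, thread_id, label) events."""

    def __init__(self, capacity=1024, clock=time.perf_counter_ns):
        # A bounded deque drops the oldest event once capacity is reached.
        self._events = deque(maxlen=capacity)
        self._clock = clock

    def record(self, thread_id, label):
        """Called from the instrumented code at points of interest."""
        self._events.append((self._clock(), thread_id, label))

    def durations(self, start_label, end_label):
        """Pair each start event with the next end event from the same
        thread and return (thread_id, elapsed) tuples in completion order."""
        open_starts = {}
        elapsed = []
        for stamp, thread_id, label in self._events:
            if label == start_label:
                open_starts[thread_id] = stamp
            elif label == end_label and thread_id in open_starts:
                elapsed.append((thread_id, stamp - open_starts.pop(thread_id)))
        return elapsed


# Usage with a fake clock so the example is repeatable.
ticks = iter([100, 150, 400, 900])
log = EventLog(capacity=8, clock=lambda: next(ticks))
log.record(1, "begin")
log.record(2, "begin")
log.record(1, "end")
log.record(2, "end")
measured = log.durations("begin", "end")
```

Because the `record` calls stay in the delivered system, the timing observed in the
laboratory and the timing in the field come from the same code path.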

Consequently, our position is that at the current state of the practice, designing code
instrumentation into the parallel or distributed real-time system specifically for testing
whether the system meets all or most of its requirements is the best way to do validation. The
instrumentation code will remain in the system for the life of the system. If at a later time the
system is changed to meet new requirements, new instrumentation code will have to be
added to validate these new features, and the system will have to be revalidated.


4.  PROMISING  AREAS  WHERE  RESOURCES  MIGHT  BE  APPLIED 


Contrary to popular opinion, we do not believe that it would be profitable to apply
resources at this time to support the development of automated design tools for parallel or
distributed real-time systems, because we feel that the design process for this class of
systems is not yet well enough understood to warrant such tools. Such tools would only help
us to make the same mistakes we are currently making at a faster rate. On the other hand,
we believe that an investment in automated testing tools that would help us better
understand the behavior of executing parallel or distributed systems would be a valuable aid in
correcting current design flaws and for learning more about how the behavior of this class
of systems is modified by using different design techniques.

Currently, we believe that the tools that would benefit a development project for a
system of this class the most are the classical software engineering tools for configuration
management and project control, which are already readily available. Next, we think that
tools that would assist in the simulation and evaluation of design alternatives would provide
the most benefit. Such tools for this class of system are not so readily available. Thereafter
would come automated tools for testing and validation, which also are not readily available.
We believe that both simulation and prototyping are necessary in the project life-cycle for
this class of systems at the present time. This iterative approach is time consuming and is
where new automated support would provide the most realizable short term gain.

Abstract

Many realtime computer system manufacturers and users need products that are scaleable;
which have consistent interfaces, functional components, and development environments;
and which span a wide spectrum—from small, simple, centralized, tactical subsystems to
large, complex, decentralized, mission-critical systems. This requires realtime OS’s which
are highly scaleable in a number of essential respects, whereas all extant ones are only
modestly scaleable. A particularly important, and hitherto intractable, form of realtime OS
scaleability is the degree of timeliness predictability—i.e., “hardness.” The Benefit Accrual
Model is a framework that generalizes the traditional special cases of deadlines as time
constraints, and unanimous optimum as the scheduling criterion; this enables timeliness to be
scaled—dynamically—over a wide spectrum of realtime “hardness” and “softness” in a unified
way. Best-effort scheduling algorithms exploit this generality. The progenitor of this
paradigm was created in 1977 and introduced in the Alpha decentralized realtime OS kernel
at Carnegie-Mellon University in 1985; the current version is being developed and
incorporated by Digital Equipment into a new version of the Mach 3 kernel for a highly scaleable
realtime OS architecture.

1 Introduction

Many realtime computer system manufacturers and users need products that are scaleable;
which have consistent interfaces, functional components, and development environments; and
which span a wide spectrum—from small, simple, centralized tactical subsystems, to large,
complex, decentralized, mission-critical systems, as required by any given application.

Suitably high degrees of scaleability benefit the computer manufacturer, solution supplier,
and user, by lowering software (and thus system) life cycle costs through such benefits as:
wider utility; easier portability and investment protection; and improved adaptability to evolving
application needs and technologies—especially valuable in long-life realtime systems.

Realtime  software  in  general,  and  operating  systems  in  particular,  are  much  more  difficult  to 
make  scaleable  than  is  hardware.  No  extant  realtime  operating  system  products  are  more  than 
modestly  scaleable,  at  best;  different  kinds  and  degrees  of  realtime  needs  are  met  with  different 
realtime  operating  systems. 

There are many dimensions in which realtime systems and operating systems are more or
(usually) less scaleable; especially important ones include functionality, decentralization,
performance, predictability (of timeliness), and fault tolerance. Of these, predictability—often
informally called “hardness” and “softness”—is the most technically (and sociologically)
challenging to make scaleable, because it requires an improved perception and understanding of
what “realtime” fundamentally means; an analogy is the improved understanding of gravity
that was required to make certain aspects of physics more scaleable.

The conventional realtime dichotomy of “hard” and “soft” realtime is too oversimplified to
be scaleable: “hard” as being “deterministic” is an unrealistic special case; and “soft” as being
all other cases is imprecise and ad hoc. This paper describes a basis for describing and
managing highly scaleable predictability of timeliness, over a wide spectrum of “hardness” and
“softness,” in a well-defined and unified way: the Benefit Accrual Model. To prepare for the
description of this model, we first discuss our understanding of realtime, determinism, and
predictability. One of the strengths of this model is that it creates the opportunity for employing
best-effort realtime scheduling algorithms, as well as conventional algorithms; an overview of
this topic is provided at the end of this paper.

2  Realtime,  Determinism,  and  Predictability 

The traditional realtime viewpoint and terminology arose from the historical emergence of
realtime computing in the context of relatively small, simple, centralized, low-level
sampled-data subsystems. Realtime systems are popularly dichotomized as “hard” versus “soft.” “Hard”
realtime conventionally is defined as being “deterministic” in the sense that the only critical
computations are those with deadlines, and the scheduling objective is that all these
computations must always meet their deadlines, otherwise the system has failed catastrophically.
“Soft” realtime conventionally is defined as being “non-deterministic” in the sense that missing a
deadline is not necessarily a catastrophic system failure—i.e., “soft” means “not hard”: in some
cases, missing certain deadlines under certain conditions may be acceptable; in other cases, the
time constraints are not really deadlines but preferred times or time ranges. “Predictability” is
commonly regarded as the metric for hardness and softness, although the term is rarely
described, much less defined. This traditional realtime viewpoint and terminology is too
imprecise, and the resulting resource management concepts and techniques are too oversimplified,
to be feasibly scaled up for larger, more decentralized systems.

We consider a computing system or operating system to be a realtime one to the extent (this
is not a binary attribute) that time—physical or logical, absolute or relative—is part of the
system’s logic (analogous to errors being states in a fault tolerant system); and in particular, to the
extent that resources are managed explicitly to satisfy the completion time constraints of the
applications’ (and thus its own) computations, whether statically or dynamically.

Time constraints, such as deadlines, are introduced primarily by natural laws—e.g., physical,
chemical, biological—which govern an application’s behavior and establish acceptable
execution completion times for the associated realtime computations. The performance of realtime
systems is evaluated in terms of the magnitude of the time constraints which can be satisfied
with given computing hardware. Specifically, we define timeliness as the metric of how
successfully the system is able to satisfy its time constraints.

“Real fast” is often confused with “realtime.” A computing system or operating system may
satisfy its applications’ computation completion time constraints implicitly (by good luck) or
by hardware brute force (e.g., MS-DOS on a 200 SPECMARK computer). Such systems may
successfully operate in realtime and (in the latter case) could be rational, cost-effective
solutions for certain applications—but by our definition they are not realtime systems, because they
do not employ realtime (time constraint driven) resource management.

Deterministic computation in the realtime context literally means that the computation’s
timing and timeliness are known absolutely, in advance [1]—there is no uncertainty about any
parameters of the computation (e.g., arrival time, execution duration) and its future execution
environment (e.g., resource dependencies and conflicts due to other computations) which could
affect its timeliness (at least barring faults, and preferably within acceptable fault coverage
premises). Thus, deterministic scheduling can—indeed, must [2]—be done off-line. There are very
few actual realtime applications and systems which (inherently or forcibly) meet this
determinism criterion of absolute timeliness certainty—most are subject to some inevitable dynamic
fluctuations and variabilities of computation and communication timing, due to input data
arrivals, resource dependencies and conflicts, overloads, and hardware and software exceptions
(not to mention faults, errors, and failures outside the presumed coverage).

We regard a computation’s timing and timeliness to be non-deterministic but predictable in
the sense that they can be estimated acceptably; determinism is the maximum, ideal, case [3]
which can only be asymptotically approached in practice (at several kinds of costs).
Predictability implies that all parameter values of the computation (e.g., arrival time, execution
duration) and its future execution environment (e.g., resource dependencies on, and conflicts with,
other computations) are known sufficiently well, and that the computation’s timeliness is
governed by processes (particularly the scheduler) whose time evolution is sufficiently well
controlled. The degree of predictability is then established according to the application-specific
interpretation of “acceptably”—e.g., it may be desired that the estimate be extremely precise in
most instances at the expense of being less so in the remainder, versus being less but equally
precise in every instance.

The timing estimations may be obtained by formal analysis, simulation, empirical
measurement, or code examination. The resulting predictability of timeliness (e.g., for response or
completion time) may be expressed in a variety of ways—e.g.: an assured upper bound (a lesser or
least upper bound since any system’s timeliness could be said to be predictable by the choice
of one high enough); or in terms of discontinuous rules which relate various execution contexts
to estimated, bounded, or even certain timeliness values (those contexts being ones which are
most likely, or most important, or just most readily relatable to timeliness estimations); or a
probability distribution function of timeliness values.
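
As a small illustration of two of these forms of expression, the sketch below summarizes a
set of measured completion times as an observed upper bound and as an empirical distribution.
The sample values and function names are hypothetical. Note the limits of the example:
empirical measurement yields only an observed bound, not an assured one, which would require
formal analysis.

```python
import math


def observed_upper_bound(samples):
    """Largest completion time seen so far (not a guaranteed bound)."""
    return max(samples)


def fraction_within(samples, limit):
    """Empirical probability that completion time is at most `limit`."""
    return sum(1 for s in samples if s <= limit) / len(samples)


def percentile_bound(samples, fraction):
    """Smallest observed value v such that at least `fraction` of the
    samples are less than or equal to v."""
    ordered = sorted(samples)
    index = max(0, math.ceil(fraction * len(ordered)) - 1)
    return ordered[index]


# Hypothetical measured completion times in milliseconds.
samples = [12, 15, 11, 14, 30, 13, 12, 16, 14, 13]

worst_seen = observed_upper_bound(samples)
within_16 = fraction_within(samples, 16)
median_bound = percentile_bound(samples, 0.5)
```

The single slow sample shows why the choice of expression matters: a bound dominated by one
outlier and a distribution that is tight in most instances describe the same system quite
differently, which is the application-specific sense of “acceptably” discussed earlier.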

When  the  parameters  of  the  computation  and  its  future  execution  environment  are  known  in 
the  form  of  random  variables  so  that  their  uncertainty  is  characterized  by  probability  distribu¬ 
tion  functions  (a  reasonable  presumption  in  many  cases),  the  computation’s  timeliness  may  be 
amenable  to  stochastic  analysis — e.g.,  the  probabilities  of  execution  completion  at  different 
times can be determined in certain situations (but, as with deterministic scheduling, many of the
most  interesting  cases  are  either  known  to  be  intractable  or  still  defy  explicit  solution).  How¬ 
ever,  the  contexts  and  thus  approaches  of  stochastic  scheduling  are  predominately  oriented  to¬ 
ward  non-realtime  objectives,  such  as  makespan  or  flowtime  [4],  which  are  analytically  and 
computationally  easier  than  stochastic  scheduling  to  meet  due  times  [5]  (and  for  which  there  is 
greater application demand than from the realtime community).
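
As a small illustration of such stochastic analysis, the sketch below assumes two independent computations that run back to back, each with a discrete execution duration distribution. The function names and the numbers are invented for illustration only. It computes the distribution of the second computation's completion time and the probability of completing by a given time.

```python
from collections import defaultdict
from fractions import Fraction


def completion_distribution(durations_a, durations_b):
    """Distribution of the completion time of B when it runs right after A.

    Each argument maps an execution duration to its probability; the two
    durations are assumed to be independent.
    """
    dist = defaultdict(Fraction)
    for d_a, p_a in durations_a.items():
        for d_b, p_b in durations_b.items():
            dist[d_a + d_b] += p_a * p_b
    return dict(dist)


def prob_done_by(dist, t):
    """Probability that completion occurs at or before time t."""
    return sum(p for completion, p in dist.items() if completion <= t)


# Hypothetical duration distributions (time units are arbitrary).
first = {2: Fraction(1, 2), 4: Fraction(1, 2)}
second = {1: Fraction(3, 4), 3: Fraction(1, 4)}
dist = completion_distribution(first, second)
```

With these made-up inputs, `prob_done_by(dist, 5)` is the chance that both computations have finished within five time units. Realistic cases with dependencies or preemption are far harder, as the text notes.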

The  parameters  of  many  realtime  systems,  especially  in  higher  level,  larger  scale,  and  more 
decentralized contexts, are often too asynchronous — i.e., intermittent, irregular, and
interdependent — to have known or tractable probability distribution functions; thus, these systems
must be treated as non-stochastically non-deterministic, for which the scheduling technology
is  still  in  its  infancy. 

Independent  of  the  computation  and  environment  parameters,  a  computation’s  timeliness 
predictability  also  depends  on  the  time  evolution  characteristics  of  the  scheduler.  It  is  normally 
taken for granted that realtime scheduling algorithms per se are deterministic even if the
parameters are not. Nevertheless, algorithms in general and scheduling algorithms in particular
sometimes take advantage (e.g., for simplicity) of making non-deterministic decisions;
stochastic schedulers have proven to be successful in certain distributed systems (e.g., [6][7]); and
non-stochastic decision making occurs not only in Petri Nets and certain programming
languages, but even in realtime scheduling algorithms (e.g., [8]). Most significant, however, is the
strong tendency for highly physically [9] and logically [10] decentralized schedulers to enter
chaotic  regimes  [11]. 


Both  determinism  and  predictability  are  independent  of  time  constraint  (e.g.,  deadline,  re¬ 
sponse  time)  magnitudes — a  system  may  be  deterministically,  or  highly  predictably,  too  slow 
with respect to some particular time constraint magnitude requirement. Thus, timeliness as a
realtime  performance  metric  includes  both  the  predictability  and  magnitude  dimensions. 

According to our definition of realtime, many computing systems are realtime to some relevant
degree: slightly — e.g., a payroll system which automatically generates the checks on time
(not  early  or  late);  a  little  more  so — e.g.,  disk  driver  software;  considerably  more  so — e.g.,  an 
OLTP  system  which  automatically  performs  financial  trading  based  on  dynamic  market  param¬ 
eters;  highly — most  (but  not  all)  computing  normally  thought  of  as  realtime. 

Furthermore,  the  realm  of  realtime  computing  is  broadening  beyond  traditional  low-level  tac¬ 
tical  subsystems,  to  include  larger,  more  complex,  more  decentralized  strategic  systems  for 
mission  management.  This  class  of  realtime  application  typically  coordinates  multiple  entities 
which  are  cooperating  adaptively  to  perform  a  mission-critical  realtime  task — such  as  manu¬ 
facturing  a  vehicle,  repairing  a  damaged  reactor,  conducting  an  air  engagement — despite  their 
individually  inaccurate,  incomplete  views  of  an  inherently  dynamic  and  uncertain  application 
and system state. Under such circumstances, both the application and computing system software
(e.g., OS) must make a best effort to accommodate dynamic and non-deterministic mission
and resource conditions in a robust, adaptable way so as to undertake that as many as possible
of the most important computations are as acceptable, in the time and other domains, to
the application as possible [9].

There has never been a conceptual or technological framework which could coherently
encompass all these degrees of realtime. Consequently, realtime computing concepts and
techniques for different systems are ad hoc and largely disjoint from each other, which causes these
differences in degree to become differences in kind. This incoherence limits the kinds of
realtime systems that can be built, and the cost-effectiveness of those that are built — in particular,
it impedes the construction of computing systems which are scaleable in degree of timeliness
predictability, and thus in other important dimensions such as functionality, complexity, and
decentralization, which require various degrees of predictability.

The  developing  paradigm  of  timeliness  described  here — the  Benefit  Accrual  Model — offers 
a  more  systematic,  general,  and  realistic  framework  which  we  believe  can  significantly  reduce 
these  limitations  of  classical  realtime  perspectives  and  technology.  It  provides  a  comprehen¬ 
sive  method  for  expressing  time  constraints  and  scheduling  objectives  that  encompasses  a  wide 
spectrum  of  realtime  “hardness”  and  “softness”  in  a  scaleable,  unified  way. 


3  The  Benefit  Accrual  Model  Of  Timeliness 

Introduction 

We consider a realtime computation to be a segment of a computational entity (such as a thread,
task, or process) subject to a completion time constraint (such as a deadline).

We define a time constraint to be the specification of: a time period during which completion
of the realtime computation’s execution affects the temporal component of its acceptability;
and that effect (e.g., completing before the deadline is acceptable, and otherwise is
unacceptable).

A time constraint is manifest in the computation program as a demarcated region of code
whose  execution  completion  time  is  subject  to  the  time  constraint.  A  computational  entity  may 
include  multiple  realtime  computations — sequentially  or  concurrently  (i.e.,  nested),  as  shown 
in  Figure  1. 




Figure  1:  Time  constraints  manifest  as  demarcated  code  regions 

The  classical  deadline  imposes  a  binary  partitioning  of  a  computation’s  completion  time 
range  into  either  acceptable  (prior  to  the  deadline),  or  not  (after  the  deadline),  as  illustrated  in 
Figure  2.  The  semantics  of  “not  acceptable”  are  specific  to  the  computation  and  application — 
e.g.,  non-productive  or  counter-productive  in  some  way. 

Often non-deterministic execution-time variabilities make it very useful to have “softer” — in
the sense of non-binary — relationships between when a realtime computation completes
execution, and the temporal acceptability of that computation. A realistic example of such a softer
time constraint is that if a particular computation cannot be completed at an optimum time —
i.e., before its “deadline” — then: completing it a little tardy is suboptimum, but better than not
completing it at all; however, completing it very tardy is worse than not completing it at all.
See Figure 3.

The description of this example indicated that the “deadline” was redefined to be the end of
the optimum, rather than the acceptable, completion time zone. The normal definition of
deadline (Figure 4) would cause popular realtime scheduling algorithms to complete more
computations in the suboptimum zone than was intended by the example soft time constraint. Thus,
such non-binary completion time/acceptability relationships raise questions such as: which
time is best considered the “deadline,” and what the other completion delimiting times are; and
how these specified times are used for scheduling.

The execution of each realtime computation is not necessarily scheduled to maximize its
individual temporal acceptability. A realtime system (normally) has a multiplicity of realtime
computations which are executed in a partial order according to a scheduling criterion: a
collective temporal acceptability criterion for a set of realtime computations, in terms of their
individual time constraints — e.g., the classical “hard realtime” criterion that all realtime
computations meet all their deadlines. In some cases — such as the classical “hard realtime” one —
there is an equivalence between the individual and collective temporal acceptability criteria
(e.g., “each” and “all”).

Figure 2: The classical deadline

Figure 3: A “softer” time constraint

Figure 4: Combination deadline and softer time constraint

A particular scheduling criterion applied to a particular set of realtime computations may
result in a subset of them whose individual time constraints will/would not be optimally satisfied;
how and when this is resolved is situation-specific (the classical “hard realtime” criterion
usually implies this condition is an overload which must be avoided a priori).

The traditional “hard realtime” scheduling criterion is a single special case which does not
apply to non-binary time constraints, such as those which have multiple completion time zones
or redefined “deadline.” “Softer” time constraints — in the sense of non-binary completion time
acceptability — necessitate associated “softer” — in the sense of non-unanimous and
non-optimum — scheduling criteria. In the context of our example, the softer criterion is that the
maximum possible number of computations complete in the optimum zone, and all the remainder
complete in the suboptimum zone.

Traditional  “soft”  realtime  scheduling  criteria  are  disparate,  ad  hoc,  and  imprecise;  thus,  they 
do  not  offer  a  basis  for  systematically  expressing  scheduling  criteria  for  non-binary  time  con¬ 
straints. 

In the Benefit Accrual Model, a time constraint is a generalization of the conventional “hard
deadline” because the conventional “deadline” and “hard realtime” scheduling criterion
involve time directly and are well-defined (contrary to the state of conventional “soft” realtime).

The Benefit Accrual Model is based on two concepts: a benefit function, which generalizes
the classical “deadline” of a realtime computation; and a benefit accrual function, which
generalizes the classical “hard realtime” scheduling criterion that a set of computations always
meet all its deadlines.

This model generalizes the author’s earlier concept of “time-value function” resource
scheduling [12][13], which was first employed in the Alpha realtime decentralized OS kernel [14][15].

Benefit  Functions 

The urgency — i.e., time criticality — of a realtime computation is expressed in terms of the
benefit it provides to the system as a function f_b of the time at which the computation is
completed (see Figure 5). The benefit metric is application-specific and defined system-wide.
Benefit functions are derived by the programmers directly from the requirements and behavior of
the realtime computation (usually an application activity); this is subject to a system-wide
engineering process (just as are assignments of classical priorities).

The function f_b is unimodal if it is concave downward (we will define that linear functions
are so) — i.e., any decrease in value cannot be followed by an increase — otherwise it is
multimodal. A multimodal function has at least one instance of a monotonic decrease in value
followed by a monotonic increase, and thus there are multiple non-contiguous time intervals when
it is better to complete the computation than during the times separating them (see Figure 6).
A multimodal function involves non-linear optimization which is often intractable on-line, so
we do not discuss multimodal functions further here.


Figure 6: Multimodal benefit functions
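
The "any decrease in value cannot be followed by an increase" reading of unimodality can be checked mechanically on a sampled benefit function. The following is only a sketch: it assumes the function is represented as a list of values sampled at increasing times, which is not a representation taken from the model itself.

```python
def is_unimodal(values):
    """Return True if no decrease in the sampled values is followed by an increase."""
    decreased = False
    for previous, current in zip(values, values[1:]):
        if current < previous:
            decreased = True
        elif current > previous and decreased:
            # The value rose again after having fallen: multimodal.
            return False
    return True


rises_then_falls = [0, 4, 8, 8, 3, 0]   # unimodal
two_humps = [0, 5, 2, 6, 0]             # multimodal
```

A constant or steadily rising sequence also passes the check, which matches the text's convention of treating linear functions as unimodal.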


A  computation’s  benefit  function  can  be  changed  each  time  it  is  released  for  execution,  as 
illustrated  in  Figure  7. 

The benefit function time axis is the one the scheduler uses. It may be physical, either
absolute (“calendar/wall clock”) time — i.e., year, month, date, hour, minute, second, mSec, µSec —
or relative to (since) some past event. Alternatively, it may be logical — e.g., a number which
monotonically increases, but not necessarily at regular intervals. In some distributed realtime
computer systems, time constraints can span nodes, which requires a trans-node time frame
(global clock). The origin of the benefit function axes is the current time (value of the system
clock) t_c, as seen in Figure 8.

Figure 7: A computation changing benefit functions each release


It  may  be  preferable  for  an  application  programmer  to  express  some  benefit  functions  in 
terms of a time parameter different from that of the global time axis — e.g.: a computation’s
deadline  being  incremental  time  units  from  now,  regardless  of  the  axis  metric;  or  a  particular 
physical  absolute  time,  though  the  axis  is  physical  relative  time — but  these  differences  must 
subsequently  be  translated  for  scheduling.  Translations  between  physical  and  logical  time 
frames  are  ordinarily  infeasible. 

Expressing a benefit function relative to a future time/event, such as the completion of some
other computation, or an external signal, is adding a (generally dynamic) dependency to the
time constraint. Dependencies must be accommodated in conjunction with time constraints
according to some specific scheduling policy, and thus are not part of the Benefit Accrual Model
per se.

The earliest time for which a benefit function is defined is called its initial time t_I; the latest
time for which a benefit function is defined is called its terminal time t_T (see Figure 8). Some
systems and scheduling algorithms call for the specification of an indefinitely extended
terminal time. A benefit function is evaluated only for values of its time parameter between the
current time and its terminal time. If the terminal time is reached (t_T = t_c) and execution of the
realtime computation has not begun or has begun but not completed, the realtime computation is
aborted and the time constraint is removed from scheduling consideration. If a realtime
computation is sufficiently likely to complete execution after its initial time, a scheduling algorithm
could choose to begin it before the initial time.


Figure  8:  Initial  and  terminal  times 
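
The terminal time rule can be sketched as a simple filter over the pending (not yet completed) computations. This is a minimal illustration only: the dictionary representation, the function name, and the computation names are hypothetical and are not part of the model or of any scheduler described here.

```python
def prune_at_terminal_time(pending, t_c):
    """Split pending computations into those still schedulable and those aborted.

    pending maps a (hypothetical) computation name to its terminal time t_T.
    A computation whose terminal time has been reached (t_T <= t_c) without
    completing is aborted and dropped from scheduling consideration.
    """
    active = {name: t_T for name, t_T in pending.items() if t_T > t_c}
    aborted = sorted(name for name, t_T in pending.items() if t_T <= t_c)
    return active, aborted


active, aborted = prune_at_terminal_time(
    {"track_update": 5, "display_refresh": 12}, t_c=5
)
```

Here `track_update` reaches its terminal time exactly at the current time and is aborted, while `display_refresh` remains under consideration.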


The later time t_L (see Figure 9) is that after which the benefit function value is
(monotonically) non-increasing; thus, completing the realtime computation at or after this time is better. A
benefit function always has a later time. The sooner time t_S is that after which the benefit
function value is (monotonically) decreasing; thus, completing the realtime computation at or
before this time is better. A benefit function need not have a t_S < t_T. If its value becomes zero or
negative at some time at or after t_S, a benefit function has an expiration time.


Figure  9:  Later,  sooner,  and  expiration  times 
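
One way to read these definitions is sketched below for a sampled, unimodal benefit function. The interpretation is an assumption: the later time is taken as the first sample at the maximum value, the sooner time as the last sample at the maximum (when a lower value follows it), and the expiration time as the first subsequent sample with zero or negative value. The function name and the sample data are illustrative.

```python
def characteristic_times(samples):
    """Return (later, sooner, expiration) for a unimodal sampled benefit function.

    samples is a list of (time, benefit) pairs in increasing time order.
    sooner and expiration are None when the function never declines, or never
    reaches zero after declining, within the sampled range.
    """
    peak = max(value for _, value in samples)
    peak_times = [t for t, value in samples if value == peak]
    later = peak_times[0]
    last_peak = peak_times[-1]

    # Any sample after the last peak sample is necessarily below the peak.
    sooner = last_peak if samples[-1][0] > last_peak else None

    expiration = None
    if sooner is not None:
        for t, value in samples:
            if t > sooner and value <= 0:
                expiration = t
                break
    return later, sooner, expiration


example = [(0, 0), (1, 5), (2, 10), (3, 10), (4, 6), (5, 0), (6, -2)]
times = characteristic_times(example)
```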


It can be necessary for a realtime computation to be completed at a time yielding zero or
negative benefit: early, rather than delaying execution until the greatest positive benefit is
expected; or tardy, rather than terminating (or not initiating) execution after there is no expectation of
positive benefit. Such cases arise due to dynamic dependencies, when a computation: has been
initiated and cannot be stopped (preempted or aborted) or undone (such as one related to a
physical activity in the application environment); or would block another if not completed,
despite its consequential zero or negative benefit.

A special case of a sooner time t_S is a due time t_D, distinguished by the benefit function’s first
derivative having an infinite discontinuity at t_S = t_D (shown in Figure 10). A deadline is a due
time subject to a collective temporal acceptability criterion which does not allow the due time
to be missed.

A benefit function is defined as hard if it has: a zero or constant negative value before t_L, and an
infinite discontinuity in its first derivative at t_L, if t_L > t_I; a due time t_D; a constant value between
t_L and t_D; and a constant value between t_D and t_T.

Figure 10: Example hard benefit function


The most common meaning of a classical “hard deadline” — a computation which completes
anytime between its initial and deadline times is uniformly acceptable, and otherwise is
unacceptably tardy — corresponds in this model to a hard benefit function with deadline t_D = t_T and
unit binary range {0, 1} (Figure 11). Classical definitions of “hard deadline” vary a little: they
generally do not provide for a t_L > t_I; sometimes the range of this function is {-∞, 1}; a few
algorithms define the range as {0, ke}, where e is the computation’s execution duration and k is
a proportionality factor; many systems allow phases within each period to be arbitrary, while
others require all the phases to be synchronized; most deterministic algorithms, such as
rate-monotonic, require the highest priority ready activity to execute, thus disallowing phase shifts.

Figure 11: Hard deadline benefit function

All benefit functions which are not hard are soft. Soft benefit functions can have arbitrary
values before and after the optimal value (Figure 12); they need not have constant values on
each side of t_L and t_D, or expiration times (Figure 13).

Figure 12: Example soft benefit function


Figure  13:  Example  soft  benefit  functions 
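
To make the hard/soft distinction concrete, the sketch below writes one function of each kind. The first follows the classical hard deadline described above, with unit binary range. The second is an invented soft shape, loosely modeled on the earlier "a little tardy is suboptimum, very tardy is worse than not completing" example. The parameter names and the linear decline are assumptions made for illustration.

```python
def hard_deadline_benefit(t, t_initial, t_deadline):
    """Classical hard deadline: benefit 1 from t_initial through t_deadline, else 0."""
    return 1 if t_initial <= t <= t_deadline else 0


def soft_benefit(t, t_optimum_end, t_zero, peak=1.0):
    """Illustrative soft benefit function.

    Full benefit up to t_optimum_end, then a linear decline that crosses zero
    at t_zero and keeps falling, so very tardy completion has negative benefit.
    """
    if t <= t_optimum_end:
        return peak
    return peak * (t_zero - t) / (t_zero - t_optimum_end)


on_time = hard_deadline_benefit(8, t_initial=0, t_deadline=10)
late = hard_deadline_benefit(11, t_initial=0, t_deadline=10)
a_little_tardy = soft_benefit(12, t_optimum_end=10, t_zero=14)
very_tardy = soft_benefit(16, t_optimum_end=10, t_zero=14)
```

Under the hard function, completing one time unit late is worth exactly as little as never completing. Under the soft one, the value of `a_little_tardy` is still positive while `very_tardy` is negative.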

A time constraint — and thus benefit function — is made known to the scheduler at its release
time (which is usually a scheduling event).

When the benefit function is released, its initial time may be either the current time — the time
constraint is released at the time it is to take effect (i.e., at t_I = t_c) — or a future time — the time
constraint is released in advance (i.e., t_I > t_c) to improve scheduling (but t_I ≤ t_c is a necessary
condition for the computation to complete, if not also begin, execution). See Figure 14.

Figure 14: The initial time may be either the current time or a future time

Expressing or releasing a benefit function relative to a future time/event, such as the
completion of some other computation or an external signal, is adding a (generally dynamic)
dependency to the time constraint. Dynamic dependencies can require a realtime computation to be
completed at a time yielding zero or negative benefit — for example, when a computation: has
been initiated and cannot be stopped (preempted or aborted) or undone (such as one related to
a physical activity in the application environment); or would block another if not completed,
despite its consequential zero or negative benefit. Dynamic dependencies can require
indefinitely extended function terminal times. Dependencies must be accommodated in conjunction
with time constraints according to some specific scheduling policy, and thus are not part of the
benefit accrual model per se.

Importance 

Each  computation  generally  also  has  a  relative  importance — i.e.,  functional  criticality — with 
respect  to  other  computations  contending  for  completion.  Importance  is  orthogonal  to  urgency: 
a computation with high urgency (e.g., a near deadline) may not be highly important; or a
computation with low urgency (e.g., a far deadline) may be very important.

Importance may be a function f_i of time and other parameters that reflect the application and
computing  system  state,  and  can  be  represented  and  employed  similar  to  urgency  (Figure  15). 


Figure  15:  Importance  function 

In simple cases, importance may be a constant, and urgency (benefit) may be simply scaled
by importance — e.g., by multiplication, addition, or concatenation. In more general cases
where importance needs to be a variable, f_b and f_i must be evaluated together dynamically to
determine the benefit — e.g., as some function g(f_b, f_i) of the f_b and f_i functions. See Figure 16.
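
Both cases can be sketched in a few lines. The constant case below scales the benefit by multiplication, one of the options the text lists; the general case evaluates a combining function g on the two function values at the same time t. The helper names, the default choice of g, and the example functions are all illustrative assumptions.

```python
def scaled_benefit(f_b, importance):
    """Constant importance: scale the benefit function by multiplication."""
    return lambda t: importance * f_b(t)


def combined_benefit(f_b, f_i, g=lambda b, i: b * i):
    """Variable importance: evaluate g(f_b(t), f_i(t)) at each time t."""
    return lambda t: g(f_b(t), f_i(t))


# Hypothetical urgency: unit benefit up to time 10, nothing afterwards.
def f_b(t):
    return 1 if t <= 10 else 0


# Hypothetical importance: rises from 1 to 3 at time 5.
def f_i(t):
    return 1 if t < 5 else 3


doubled = scaled_benefit(f_b, 2)
dynamic = combined_benefit(f_b, f_i)
```

With these inputs, `dynamic(4)` uses the lower importance, `dynamic(7)` the higher one, and `dynamic(11)` is zero because the urgency function has already dropped to zero.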

Execution  Duration 

A realtime computation has an execution duration e which the scheduler usually has some
information about prior to execution. This information can be either known deterministically
(the most common presumption), or estimated. Most estimates are stochastic (known in
expectation), but alternatively may be non-stochastic — e.g., bounds or rules. Execution duration
information may or may not take into account a forecast of dynamic dependencies.
Non-deterministic durations may be estimated dynamically (during the computation’s execution) — e.g.,
conditional probability distributions, or execution-time knowledge-driven rules.


Benefit  Accrual  Functions 

The  scheduler  considers  all  released  time  constraints  between  the  current  time  and  its  horizon 
t_H — the future-most terminal time (Figure 17). It assigns the estimated execution completion
times,  and  consequently  the  initiation  times  and  ordering,  for  those  computations  using  an  al¬ 
gorithm  which  seeks  to  sufficiently  satisfy  the  scheduling  (collective  temporal  acceptability) 
criterion  (such  as  earliest-deadline-first  for  the  classical  “hard  realtime”  criterion  of  all  compu¬ 
tations  meeting  their  deadlines).  The  algorithm  should  also  take  into  account  dependencies  and 
importances. 


Figure  17:  The  scheduler  considers  all  released  benefit  functions  to  its  horizon 

It  is  feasible  to  schedule  a  particular  set  of  realtime  computations  if  its  collective  temporal 
acceptability  criterion  can  be  sufficiently  satisfied.  A  particular  set  of  realtime  computations  is 
schedulable if there exists at least one algorithm which can feasibly schedule it. A scheduling
algorithm is optimal if it always produces a feasible schedule whenever, in the static case, any
other algorithm can do so; in the dynamic case, a static algorithm with complete a priori
knowledge would do so.
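
For the classical hard criterion, feasibility can be tested directly in one very restricted setting. The sketch below assumes a single processor, independent computations that are all ready at time zero, exactly known execution durations, and no preemption. Under those assumptions, ordering by earliest deadline meets all deadlines whenever any ordering does, so one pass over that ordering decides feasibility. The function name and the job tuples are hypothetical.

```python
def edf_feasible(jobs):
    """Check whether all deadlines can be met for jobs ready at time zero.

    jobs is a list of (duration, deadline) pairs. Returns (feasible, order),
    where order is the earliest-deadline-first sequence that was examined.
    """
    order = sorted(jobs, key=lambda job: job[1])
    clock = 0
    for duration, deadline in order:
        clock += duration
        if clock > deadline:
            return False, order
    return True, order


feasible, order = edf_feasible([(2, 10), (3, 4), (1, 5)])
overloaded, _ = edf_feasible([(3, 4), (2, 4)])
```

Once release times, dependencies, or uncertain durations enter, this simple test no longer applies, which is the situation the rest of this section addresses.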

The ideal case of every computation always completing execution at an optimum time is
unrealistic in general. Even though the traditional “hard realtime” cases are intended — and
commonly imagined — to achieve this ideal, physical laws (especially in asynchronous
decentralized systems) or the intrinsic nature of the applications (especially at mission management
levels) generally make it non-cost-effective or even impossible.

Most actual realtime systems desire a sufficient number of computation completion times to
be sufficiently likely to be sufficiently acceptable (perhaps optimal) under the current
application and computer system circumstances.

For the special case of any collective temporal acceptability criterion defined to be a
unanimous optimum of the individual temporal acceptabilities, there is an equivalent criterion
defined in terms of individual, rather than collective, optimums — e.g., meet all deadlines means
meet each deadline, and maximize all benefits means maximize each benefit.

In  general,  collective  temporal  acceptability  is  not  defined  as  necessarily  unanimous  or  opti¬ 
mum  with  respect  to  the  individual  computations’  temporal  acceptability — e.g.,  minimize  the 
number  of  missed  deadlines,  or  maximize  the  sum  of  the  benefits. 

In the Benefit Accrual Model, collective temporal acceptability criteria are based on accruing
benefit from the individual computations in a set, in a manner specified by a benefit accrual
function for that set. This is general enough to encompass a wide range of temporal
acceptability criteria — e.g.: the optimum cases such as traditional “hard realtime,” for which the function
is the product of the individual benefits (assuming the usual range of {0, 1}); potentially
suboptimum cases, for which example functions are to maximize the sum (mean, etc.) of the
individual benefits; the number of computations during a time frame T which achieve at least P
percent of their maximum possible benefit; the probability that at least P percent of the
computations during a time frame T will achieve their maximum benefits. Collective temporal
acceptability criteria can be employed for scheduling or performance evaluation.
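
Three of the example accrual functions named above can be written down directly. This is a sketch under simple assumptions: the benefits actually obtained by the computations in a set (within some time frame) are given as a list, and the function names are invented for illustration.

```python
from math import prod


def accrue_product(benefits):
    """Hard realtime style: with benefits in {0, 1}, the product is 1 only if all are 1."""
    return prod(benefits)


def accrue_sum(benefits):
    """A potentially suboptimum criterion: total benefit accrued by the set."""
    return sum(benefits)


def count_reaching(benefits, maxima, percent):
    """Number of computations achieving at least `percent` of their maximum benefit."""
    return sum(1 for b, m in zip(benefits, maxima) if b >= m * percent / 100)


all_met = accrue_product([1, 1, 1])
one_missed = accrue_product([1, 0, 1])
total = accrue_sum([10, 4, 0])
good_enough = count_reaching([10, 4, 0], maxima=[10, 8, 5], percent=50)
```

A scheduler would try to maximize whichever of these the application specifies; the same functions can also be applied after the fact for performance evaluation.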

4  Best-Effort  Scheduling 

Introduction 

Scheduling  principles  and  practices  which  are  realtime  by  our  definition  (i.e.,  based  on  satis¬ 
fying  completion  time  constraints)  have  until  recently  been  focused  exclusively  on  guarantee¬ 
ing  that  a  unanimous  optimum  scheduling  criterion  will  be  met  (e.g.,  the  classical  “hard  real- 
time”  case  of  guaranteeing  that  all  deadlines  are  always  met).  Even  though  the  traditional  “hard 
realtime”  cases  are  intended — and  commonly  imagined — to  achieve  this  ideal,  physical  laws 
(especially  in  decentralized  systems)  or  the  intrinsic  nature  of  the  applications  (especially  at 
mission  management  levels)  generally  make  it  either  non-cost-effective  or  impossible  (there 
are  only  a  few  exceptions). 

In  general,  realtime  systems  need  a  sufficient  number  of  computation  completion  times  to  be 
sufficiently  likely  to  be  sufficiently  acceptable  (perhaps  optimal),  given  the  current  application 
and  computer  system  circumstances  (perhaps  over  a  wide  range  of  such  circumstances)— 
where  each  instance  of  “sufficient”  is  application-specific. 

The Benefit Accrual Model provides a framework for expressing “softer” time constraints —
in the sense of non-binary completion time acceptability — and scheduling criteria — in the
sense of non-unanimous and non-optimum. It accomplishes this in addition to — and in the
same manner as — the conventional “hard” time constraints and scheduling criteria. These
softer needs are realized with best-effort scheduling algorithms.

Best-effort  (BE)  realtime  scheduling  algorithms  aggressively  seek  to  provide  the  “best” — as 
specified  by  the  application —  computational  timeliness  they  can,  given  the  current  application 
and  computer  resource  conditions.  Best-effort  resource  management  is  generally  heuristic — a 
familiar  approach  at  the  application  levels  (most  conspicuously  in  artificial  intelligence,  pattern 
recognition) and less visibly at the system software levels. Because heuristics are essentially
foreign  in  traditional  realtime  systems,  we  employ  the  term  “best  effort”  to  more  clearly  evoke 
our  intended  departure  in  philosophy — analogous  to  the  utilization  of  the  term  “guess”  [16]  for 
inferences  performed  by  certain  intelligent  user  interfaces,  e.g.,  [17]. 

Heuristics  in  general,  and  best  effort  realtime  resource  management  in  particular,  involve 
trade-offs of risk management and situational coverage. Best-effort on-line realtime scheduling
heuristics currently offer empirically-based high confidence that acceptable computational
timeliness  will  be  achieved  over  a  broad  range  of  conditions;  but  with  no  or  low  formal  bounds 
on  guaranteed  timeliness  (note  that  this  is  necessarily  always  true  of  the  humans  currently  per¬ 
forming  best-effort  resource  management).  Conversely,  traditional  “hard”  realtime  scheduling 
algorithms  provide  formal  guarantees  of  optimal  computational  timeliness  under  extremely  re¬ 
stricted — generally  unrealistic — conditions,  but  behavior  which  is  unknown  or  known  to  be 
pathologically  wrong  outside  those  conditions.  Examples  of  applications  which  seem  to  call 
naturally  for  each  of  these  extremes  come  immediately  to  mind — but  in  making  the  trade-offs 
and  compromises  to  find  an  application-specific  appropriate  middle  ground,  one  must  beware 
of  the  human  trait  to  undervalue  the  reduction,  as  opposed  to  the  elimination,  of  risks  [18]. 

Overview  of  Best-Effort  Realtime  Scheduling  Work 

This concept, and the Time-Value Function progenitor of the Benefit Accrual Model as a
framework for expressing time constraints, were originated by Jensen [12][13]. The first
generation of BE — on-line (at execution time) — scheduling algorithms emerged from Jensen’s
Archons Project at CMU [19], for the Alpha realtime decentralized OS kernel [14]: Locke’s
algorithm [7] and Clark’s algorithm [8].

Locke’s  algorithm  allows  a  wide  variety — but  not  all  forms— of  Time-Value  Functions 
(TVF’s).  Locke  intends  that  importance  be  reflected  by  scaling  the  TVF  values.  The  scheduling 
optimality  criterion  is  the  special  (but  reasonable)  case  of  maximizing  the  sum  of  the  job  values 
attained.  Execution  times  are  defined  stochastically. 

The algorithm schedules jobs Earliest-Deadline-First (EDF) since that is optimal when
underloaded. If a job arrival, or execution time overrun, results in a sufficiently high probability
of overload, jobs are set aside in order of minimum expected value density (expected
value/expected remaining execution time) until the probable overload is removed.
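
The shedding rule just described can be sketched as follows. This is a minimal illustration, not Locke’s actual algorithm: it assumes deterministic expected execution times and a yes/no overload test in place of the stochastic “sufficiently high probability of overload”, and the names `Job`, `edf_overloaded` and `best_effort_schedule` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: float            # absolute deadline
    expected_value: float      # value accrued if the job completes in time
    expected_remaining: float  # expected remaining execution time

    @property
    def value_density(self) -> float:
        # expected value / expected remaining execution time
        return self.expected_value / self.expected_remaining


def edf_overloaded(jobs, now=0.0):
    """Return True if running `jobs` in EDF order would miss a deadline.

    Assumption: expected times are treated as exact, so "overload" is a
    deterministic test rather than a probability threshold.
    """
    t = now
    for job in sorted(jobs, key=lambda j: j.deadline):
        t += job.expected_remaining
        if t > job.deadline:
            return True
    return False


def best_effort_schedule(jobs, now=0.0):
    """Set aside minimum value density jobs until the EDF order is feasible.

    Returns (schedule in EDF order, jobs set aside in shedding order).
    """
    kept = list(jobs)
    shed = []
    while kept and edf_overloaded(kept, now):
        victim = min(kept, key=lambda j: j.value_density)
        kept.remove(victim)
        shed.append(victim)
    return sorted(kept, key=lambda j: j.deadline), shed
```

With three jobs of equal length where the middle one has the lowest value density, the sketch sets that job aside and runs the other two in deadline order; when no overload is detected it degenerates to plain EDF.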

Locke’s  algorithm  does  not  address  dependencies  (e.g.,  precedence,  resource  conflicts). 

Locke used simulations to demonstrate that his algorithm performed well in comparison to
others for a number of interesting overload cases, but provided no formal performance
characterizations. Versions of Locke’s algorithm have been implemented and experimentally verified
to be superior and cost-effective with respect to traditional realtime scheduling algorithms for
a number of interesting cases. In the Alpha realtime distributed OS kernel, these included a
battle management application for air defense, by General Dynamics and the Archons Project at
CMU [20], and a ball-and-paddle realtime scheduling evaluation testbed by the Archons Project
[21]; the Alpha version also added nested time constraints and timeliness failure abort
processing. Locke’s algorithm was also implemented in the Mach 2.5 OS kernel, and measured on a
synthesized realtime workload by the Archons Project [22].

Clark’s algorithm makes a major contribution by dealing with dependencies (e.g., precedence,
resource conflicts) which are not known in advance. It employs the same scheduling
optimality criterion as Locke’s. Clark permits only rectangular TVF’s, whose value is the job’s
importance. Job execution times are both fixed and known.

Clark’s algorithm selects jobs to be scheduled in decreasing order of value density (VD), and
then selected jobs are scheduled EDF; for the TVF’s he allows, this both meets all deadlines
and maximizes summed value. When each job is scheduled, so are those on which it depends.
If necessary, precedent jobs are aborted or their deadlines are shortened (whichever is faster),
to satisfy the deadline of the dependent job.

Clark used formal analysis and simulations to show that when overloaded, if the algorithm
can apply all available cycles to jobs that complete, no other algorithm can accrue greater value
given the current knowledge; but since future jobs are unknown, there is no performance
guarantee. Clark’s algorithm is being implemented in both the Alpha and Mach 3 OS kernels.

A second generation of on-line BE algorithms is being devised as part of a recent
multi-university effort to establish formal performance bounds for on-line algorithms in general and
certain BE ones in particular [23][24][25]. Their work is focused on the competitive factor, which
measures the value an algorithm guarantees it will achieve compared to a clairvoyant
scheduler.

Like Clark’s, their algorithms allow only rectangular TVF’s, and (mostly) require both fixed
and known execution times.

The principal result is that if all values are proportional to execution time, an on-line
algorithm can guarantee a competitive factor of no more than 1/4. The performance bound is lower
when value is not proportional to execution time, or the ratio of maximum to minimum value
density increases, or execution times are not fixed and known.
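
For orientation, the bound is often quoted in the on-line scheduling literature in the following closed form. The symbol k (the ratio of maximum to minimum value density) and the expression for k > 1 do not appear in the text above; they are given here as an assumption drawn from that literature, and the case k = 1 recovers the 1/4 figure.

```latex
\[
  \text{competitive factor} \;\le\; \frac{1}{\bigl(1+\sqrt{k}\bigr)^{2}},
  \qquad
  k \;=\; \frac{\max_i \mathrm{vd}_i}{\min_i \mathrm{vd}_i},
  \qquad
  k = 1 \;\Rightarrow\; \frac{1}{(1+1)^{2}} = \frac{1}{4}.
\]
```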

This confirms the intuition that realtime performance guarantees are impossible if workload
characteristics are unknown. However, the most recent research suggests that acceptable
performance assurances may be possible when limited, reasonable, workload information is
known; learning and understanding such trade-offs is one of the most important advances still
to be made for BE algorithms.

Maynard’s thesis [26] is improving the understanding of the overload behavior of on-line
realtime scheduling algorithms, and developing techniques for defining benefit functions to yield
desired overload behavior. Its scope includes BE schedulers that use benefit density as the load
shedding criterion. The work to date provides an algorithm for setting job importance values
to impose a strict priority ordering among selected groups of jobs. This allows integration of
results from off-line schedulability analysis, to both provide “guarantees” when necessary and
possible, and retain the adaptability of dynamic scheduling. His simulations support the
validity of the approach. He is also creating tools which help the system designer select and adapt
suitable scheduling algorithms for specific applications, and choose appropriate job importance
values.

The most closely related work to BE realtime scheduling is Cost-Based Scheduling for queueing
and dropping network packets [27]. In this context, a cost function specifies the cost per unit
length of queuing delay for a packet as a function of time. Packets are limited to non-decreasing
cost functions.

Unlike BE processor scheduling, which creates a whole schedule, the cost-based network
algorithm queues the next packet which it estimates would cost the most to delay. Cost is
calculated using an estimation of future cost that would be incurred, which is the same for all packets.
The optimization objective is to minimize the average delay cost incurred by all packets.
Dependencies are not considered, but are explicitly recognized as critical.

Their simulations show that the algorithm performs well compared to the standard packet
queuing algorithms, and to Locke’s algorithm, for packets averaging unit length, in near fully
loaded conditions. The premises of this work do not correspond well to the workload
characteristics of interest for best-effort realtime computation scheduling.

In addition to this on-line research, a first generation of off-line BE algorithms is being
devised in France [28][29].

Benefit functions employ more application-supplied information, and thus exact a higher
computational price than when little such information is used (e.g., by static priority) or no
information is used (e.g., by round robin). Best-effort realtime scheduling algorithms utilize
more application-supplied information than is usual, and place specific requirements on the
kind of scheduling mechanisms that must be provided (i.e., in the OS kernel; cf. those of the
Alpha kernel).

These prices can be minimized by good engineering, and then paid in different ways,
including with inexpensive hardware: higher performance processors; a dynamically assigned
processor in a multiprocessor node; or a special-purpose hardware accelerator (analogous to a
floating-point co-processor) in a uniprocessor or multiprocessor node.

1. INTRODUCTION

Too often, designers argue about the merits of their favorite approach and about the drawbacks of
other approaches, without addressing the fundamental underlying issue:

what is the rationale for adopting a particular design approach?

System design can be, and should be, as rigorous as demonstrations in Mathematics. There are
obvious analogies, shown in Table 1.

Too often, careful examination of published solutions/methodologies reveals logical
contradictions. For example, design solutions are antagonistic with what should be the
appropriate computational model (sometimes, this happens because the model is not even
described!). Two examples:

-  the priority ceiling protocol is nonsense in the context of distributed systems; one of the
reasons is that it rests on the assumption that processes may observe the global system state
(e.g. via a global semaphore), which is known to be infeasible in distributed systems

-  the IEEE 802.4 (token-bus) LAN standard is based on a violation of the design premises;
collisions are ruled out, hence the token-passing solution; nevertheless, designers had to
provide for collision detection and resolution because of token loss and virtual ring
reconfiguration; doing something equivalent in Mathematics, i.e. demonstrating a theorem
that violates an axiom, is very embarrassing!

Careful examination may also reveal tautologies. Consider the atomic broadcast problem. Atomic
broadcast guarantees that non-faulty processors see identical histories of (possibly concurrent)
broadcasts, i.e. no loss and identical orderings. The ISIS ABCAST protocol is supposed to be a
solution. In fact, this protocol is incorrect if the underlying (physical) broadcast subsystem is not
atomic itself! So, what is added by ABCAST, besides overhead?

Finally, careful examination may reveal severe design flaws. A very important example is that of
distributed scheduling based on fixed (computed off-line) priorities. This naive (and erroneous)
approach is derived from the rate monotonic/deadline monotonic methodologies. These
methodologies are sound and should be used in their intended context whenever appropriate,
namely single-processor fault-free architectures running periodic (or sporadic) tasks,
characterized with simple timeliness attributes (i.e. deadlines, all tasks have the same value). In
the context of such simple computational models, timeliness attributes can be rigorously
transformed into time-independent (fixed) priorities, using the RM or DM methodologies.
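
The transformation from timeliness attributes to fixed priorities in the single-processor periodic case can be sketched as below. This is only an illustration of the classic rate monotonic rule (shorter period, higher priority); the utilization test is the well-known sufficient condition usually attributed to Liu and Layland, which is not stated in the text above, and the names `PeriodicTask`, `rate_monotonic_priorities` and `passes_utilization_test` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    name: str
    period: float  # assumed equal to the relative deadline (classic RM setting)
    wcet: float    # worst-case execution time


def rate_monotonic_priorities(tasks):
    """Map task name -> fixed priority, 0 being the highest.

    Rate monotonic rule: the shorter the period, the higher the priority.
    """
    ordered = sorted(tasks, key=lambda t: t.period)
    return {task.name: prio for prio, task in enumerate(ordered)}


def utilization(tasks):
    """Total processor utilization of the task set."""
    return sum(t.wcet / t.period for t in tasks)


def utilization_bound(n):
    """Sufficient utilization bound n * (2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def passes_utilization_test(tasks):
    """Sufficient (not necessary) single-processor schedulability test."""
    return utilization(tasks) <= utilization_bound(len(tasks))
```

Nothing in this sketch carries over to the distributed case discussed next: it relies on a single processor, no faults, and periods known off-line.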

However,  there  is  no  equivalent  methodology  for  distributed  architectures  (i.e.  richer,  more 
complex,  but  also  more  realistic  computational  models).  Hence,  in  the  case  of  real-time 
distributed  systems,  designs  and  solutions  that  are  based  on  fixed  priorities  are  barely  interesting, 
because  the  following  (very  difficult)  issue  is  left  open  :  how  to  transform  distributed  task 
timeliness  attributes  into  integers  so  that  the  set  of  specified  system-wide  timing  constraints  are 
provably  met  ? 

A  real-time  distributed  system  designer  who  would  use  fixed  priority  based  scheduling  has  no 
means  to  demonstrate  that  he  is  solving  the  originally  stated  problem.  Who  would  trust  the 
resulting  system  ? 

Let us be clear. We are not claiming that the rigorous design/implementation cycle shown in Table 1
is always feasible. There are instances of system objectives/requirements whose complexity raises
design problems that are beyond the reach of current state-of-the-art in Computer Science. When
faced with such problems, we are forced to use «pragmatic» approaches (e.g. heuristics, testing in
lieu of proofs/validation, etc.), until state-of-the-art matures.

However, it is our view that too often, designs are poorly/erroneously conducted, not because
state-of-the-art is limited but because state-of-the-art is ignored.

2.  RATIONALE  FOR  A  CORRECT  COMPUTATIONAL  MODEL  IN  THE 
CASE  OF  LDPART  SYSTEMS 

Our  purpose  is  to  show  how  one  can  arrive  at  a  rigorous  identification  of  a  correct  computational 
model  intended  for  designing  Large  Distributed,  Parallel  Architecture  Real-Time  (LDPART) 
Systems. 

We  proceed  by  identifying  the  implications  (noted  I)  of  the  stated  system  objectives/requirements. 

I1 : distribution/parallelism → concurrency

Synchronous or asynchronous concurrency ?

I2 : large (systems) → probably asynchronous concurrency

I3.1 : real-time → fault-tolerance

Faults are stochastic events. Hence, I3.1 combined with I1 yields a confirmation of I2.

⇒ Conclusion #1 : asynchronous concurrency

In passing, let us notice the following reciprocal implications :

fault-tolerance → redundancy

fault-tolerant redundancy management → distribution/parallelism

I3.2 : real-time → scheduling

Off-line or on-line scheduling ?

Conclusion #1 precludes clairvoyance (full knowledge of the future system/environment histories
such as external event arrival laws, inter-task conflicts, etc.).

⇒ Conclusion #2 : on-line scheduling.

Let us now examine the proof/validation issue. Simply stated, there are two schools of thought, or
two major design approaches.

One of them, called the static design approach (S), rests on clairvoyance assumptions. Critical
attributes of an LDPART system constitute a multidimensional space [external event arrival laws,
task durations, fault «arrival laws», internal event (e.g. message) arrival laws, shared resource
conflict patterns, task time dependent/time independent values, etc.]. Proofs/validation established
for design solutions based on clairvoyance assumptions are valid only for a particular point (at
best, a small region) in this multidimensional space. In other words, the coverage factor of such
proofs/validation is not very good.

The other one, called the dynamic design approach (D), is consistent with the widely accepted
fact that clairvoyance assumptions are unrealistic in general. It can be demonstrated that
clairvoyance assumptions are fully antagonistic with the very nature of LDPART systems.
Proofs/validation can indeed be established for design solutions based on zero or limited a priori
knowledge. The price incurred to establish (more complex) proofs compared with demonstrating
properties under an S approach is paid only once, with the very interesting result that such
proofs/validation are valid throughout the entire multidimensional space (zero a priori knowledge) or
most of it (limited a priori knowledge). In other words, the coverage factor of such
proofs/validation is very close or equal to one [1].

Examination  of  current  state-of-the-art  in  Computer  Science  indicates  that  proofs/validation  exist 
today  for  many  D  design  solutions,  the  only  ones  to  match  the  basic  nature  of  LDPART  systems. 
We  believe  that,  as  time  goes  by,  more  proofs/validation  will  become  available  for  D  design 
solutions,  thus  relegating  some  of  the  S  design  solutions  to  the  precambrian  era. 

Let us give examples of design solutions that have a poor coverage factor in the context of
LDPART systems. The first example is concerned with those real-time systems that are based on
the premise that external events should be looked at only when appropriate, i.e. periodically
sampled, this being thought to be absolutely required to prove timeliness properties. In fact, what
this really means is that such an extraordinary assumption is made to ease the designer’s job! It
looks like the question of whether the environment really behaves periodically is of minor
importance.

The second example is linked with the first one. A priori knowledge of periodical patterns is then
used to «solve» the shared bus multiaccess problem, using a very conventional solution that is
Static Synchronous Time Division Multiplexing [allocation of bus slots to (periodical) messages
is pre-computed off-line]. The construction of a reliable distributed SS-TDMA bus precisely
raises the fundamental issue of how to synchronize entities (bus attachment units) without
resorting to centralized control. This issue does not seem to be considered worth addressing either,
by those who believe that it is «simple enough» to be tackled «at the hardware level». They are
wrong. Synchronization is an essential algorithmic issue, even at the hardware level [2]. Now the
fallacy: SS-TDMA bus based systems are publicized as «distributed systems». Those who
believe in such designs fool their audience (or try to do so) and maybe fool themselves, for the
most sincere of them, by calling «real-time distributed system» what turns out to be just a
synchronous asymmetrical wait-state free multiprocessor. They do not realize that with this type
of design, they can only mimic, with a digital technology, what used to be called analog
computers!

Let us now summarize. The examination of the proofs/validation issue reveals that D design
solutions have coverage factors greater than those resulting from S design solutions. This is
consistent with conclusion #1, which precludes clairvoyance.

We thus conclude that the correct computational model to reason about, to design and to prove
properties of LDPART systems is the model of asynchronous concurrent, on-line scheduled
computations.

3.  MAJOR  ISSUES 

3.1.  Lack  of  clairvoyance  does  not  preclude  proving 

With our approach, the four major issues are mutually related. We will thus present how we
proceed to design and validate LDPART systems. Positions and recommendations are clearly
pointed out in the sequel.

Let  us  first  mention  that  w.r.t.  formal  models  and  logics,  we  have  no  special  recommendation,  as 
there  is  no  consensus  on  useful  models  of  distributed/parallel  systems  and  logics  for  reasoning 
about/proving  their  properties,  this  being  especially  true  when  considering  timeliness  properties. 
This  is  an  example  of  an  area  where  state-of-the-art  does  not  offer  sufficiently  powerful  methods 
or  solutions  yet. 

In order to conduct a validated LDPART system design, we (not surprisingly) recommend using
our general computational model (see above), which has proved to be adequate to tackle all
critical issues consistently, essentially Concurrency Control (Synchronization), Real-Time
Scheduling, Global Time and Fault-Tolerance. A design model, which is derived from this
computational model, is used to «instantiate» those algorithms (solutions to the critical issues)
that are proven correct, and appropriate for a particular LDPART system under consideration. As
will be shown, this design model essentially is an object oriented transactional model. Our
approach is based on proving properties of individual algorithms (solutions) and on proving
properties of algorithm composition, this being done within the framework given in section 2, i.e.
assuming only partial knowledge or zero knowledge of the future.

We  refute  the  conventional  precambrian  allegation  that  proving  is  impossible  if  one  is  not 
clairvoyant.  Convincing  arguments  in  favor  of  on-line  (non  clairvoyant)  algorithms  can  also  be 
found  in  [3]. 


Separately, when considering a particular architecture for implementation, those variables found
in the (proven) solutions can be valued, thus yielding the numerical values of such measures of
interest as upper bounds on response times, lower bounds on global time precision, lower bounds
on task throughput, etc.


Our position is clearly to separate design and dimensioning. For those solutions selected during
system design, we use established proofs or we prove properties of interest, under the form of
computable functions or under the form of theorems. Let us give two examples.

The serializability theorem established for well-formed transactions which obey the 2-phase
locking rule is an example of a (safety oriented) proof established with zero knowledge of the
future. Similarly, we have proved the existence of finite upper bounds on response times for
shared multiaccess broadcast networks, in the absence of overloads, without assuming a priori
knowledge of individual node traffic patterns (contrary to what is needed with SS-TDMA or token
passing protocols). The corresponding protocol, called Deterministic Ethernet or CSMA-DCR
(Deterministic Collision Resolution), has a variant called DOD-CSMA/CD (Deadline Oriented
Deterministic CSMA/CD) which is capable of handling messages that have deadlines. We have
proved that DOD-CSMA/CD is an optimal on-line distributed scheduling algorithm for periodic
and sporadic message arrivals. This, and the absence of overload, are the only clairvoyance
assumptions needed. We conjecture that an optimality proof can also be given in the case of
overloads and for aperiodical arrivals.
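
The 2-phase locking rule invoked in the first example can be checked locally, one transaction at a time, without any knowledge of what other transactions will do in the future. The sketch below is an illustration under simplifying assumptions (exclusive locks only, one trace per transaction); the function name and the trace format are hypothetical.

```python
def obeys_two_phase_locking(ops):
    """Check a single transaction's trace against the 2-phase locking rule.

    `ops` is a list of (action, item) pairs, with action in {"lock", "unlock"}.
    The rule: once the transaction has released any lock (shrinking phase),
    it may not acquire another one. The trace is also required to be
    well-formed: no double lock, no unlock of an item that is not held,
    and nothing left locked at the end.
    """
    held = set()
    shrinking = False
    for action, item in ops:
        if action == "lock":
            if shrinking or item in held:
                return False
            held.add(item)
        elif action == "unlock":
            if item not in held:
                return False
            held.remove(item)
            shrinking = True
        else:
            raise ValueError(f"unknown action: {action!r}")
    return not held
```

A trace that locks x and y and then releases both passes; a trace that releases x before locking y fails, since it acquires a lock after its shrinking phase has begun.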

3.2.  A  design  model  for  LDPART  systems 

Execution of asynchronous concurrent computations is conducted via Concurrency Control (CC)
algorithms. We rely on the Serializability Theory and on related CC algorithms to prove
properties (safety and liveness properties) of concurrently executing application S/W modules. A
real-world, well established, incarnation of the Serializability Theory is the Transactional Model,
which turns out to match Object Orientation quite well. We have found the following design
model to be very appropriate when considering LDPART systems :

-  a  transaction  is  a  set  of  actions/threads  that  invoke/enter  objects  (to  access  computational 
resources) 

-  the  binding  between  actions/threads  and  objects  is  dynamic  (a  consequence  of  our  rejection  of 
clairvoyance) 

-  objects are characterized with three attributes, namely multiplicity, persistency, access.

Multiplicity

•  unique  instantiation  (UI)  of  every  object,  no  assignment  problem 

•  multiple  instantiations  (MI)  of  every  object,  constituting  a  class  per  object  (e.g.  processors, 
multiple  copies  of  a  variable),  raising  an  assignment  problem 

Persistency

•  no persistency property (NP), for such objects as processors, communication links

•  persistency  property  (P),  for  such  objects  as  mechanical  devices  (e.g.  physical  orientation  of  a 
robot  arm),  data  structures 


Access

•  centralized (C), i.e. global knowledge (GK) is available (within the limits of what is possible
when assuming no clairvoyance)

•  distributed (D), i.e. only partial or incomplete knowledge (subsets of GK) is available.

System-wise, there are two possibilities :

•  single-object  systems  (e.g.  a  uniprocessor,  a  communication  channel)  (SOS) 

•  systems  comprising  multiple  (non  equivalent)  objects  (MOS) 

No  need  to  elaborate  on  the  well-known  advantages  of  the  Transactional  Model  and  Object- 
Orientation  w.r.t.  S/W  engineering  as  well  as  w.r.t.  the  existence  of  guaranteed  properties,  such  as 
ACID  properties  for  transactions  and  on-line  updatable  data  (All-or-nothing,  Consistency, 
Isolation,  Durability). 


On the contrary, it might be necessary to elaborate on the following. LDPART systems contain
and maintain distributed data structures that are updated on-line. Examples are sensor/actuator
status tables, environmental data, internal system tables. Hence the need to use CC algorithms.

Of course, safety or liveness properties are not equivalent to timeliness properties. Consequently,

With on-iin

3.3. Distributed control of asynchronous concurrency and distributed on-line
scheduling


The design model given above can now be used to reason about such a composition. Table 2
shows a condensed taxonomy-oriented view of the issues, for the UI case (essentially, the MI case
is obtained by adding the assignment problem). Table 2 can be illustrated as follows. (HPF stands
for Highest Priority First, EDF stands for Earliest Deadline First).


c1   HPF/EDF for uniprocessors
c2   HPF with priority inheritance for uniprocessors
c3   scheduling over asymmetrical shared-memory multiprocessors
c4   HPF and priority ceiling over shared-memory multiprocessors
d1   EDF over a broadcast communication channel
d2   EDF over a multiclient/single file server system
d3   best-effort scheduling over a distributed system
d4   EDF + 2-phase locking + deadline-based deadlock avoidance


It is now easy to address the combined issues of CC and Scheduling. Under a deadlock prevention
approach, the CC issue disappears. One is left with the problem of choosing an optimal on-line
distributed scheduling algorithm. Under a deadlock avoidance or a deadlock detection-resolution
approach, one must select CC and Scheduling algorithms in such a way that, for any given
conflict scenario between two real-time transactions A and B, both algorithms make the same
decision (A precedes B or vice-versa). The example given for d4 illustrates a compatible
combination. However, this correct combination is not optimal, i.e. it cannot handle some
scenarios (e.g. overloads) that can be accommodated by other combinations.

It  might  be  appropriate  to  concentrate  a  little  on  the  notion  of  optimality  for  on-line  distributed 
algorithms.  This  notion  is  built  upon  the  definition  of  optimality  for  on-line  algorithms.  An 
optimal  on-line  algorithm  has  the  smallest  achievable  competitive  ratio,  noted  r  (i.e.  the  smallest 
loss  compared  to  a  clairvoyant  algorithm).  Given  the  same  amount  of  limited  a  priori  knowledge, 
no  other  algorithm  can  dominate  an  optimal  algorithm. 

When  distribution  must  be  accounted  for,  global  knowledge  GK  that  is  given  a  priori  is  not 
accessible  to  an  algorithm  (as  is  the  case  with  centralized  on-line  algorithms).  Consequently,  some 
(distributed)  algorithm  must  be  used  to  «bridge  the  gap»,  i.e.  to  approximate  GK.  Assume,  for 
example,  that  broadcasting  is  free  of  charge  and  instantaneous.  Then,  distributed  schedulers  are 
always  cognizant  of  those  tasks  waiting  to  be  scheduled,  system-wide.  In  this  (ideal)  case, 
«distributed»  optimality  is  equivalent  to  «centralized»  optimality. 

Optimality  in  distributed  systems  corresponds  to  achieving  the  best  approximation  of  GK,  noted 
k,  i.e.  the  smallest  uncertainty  ratio  u  =  GK/k.  Consequently,  an  optimal  on-line  distributed 
algorithm  has  the  smallest  achievable  competitive  ratio  u*r. 

As mentioned before, DOD-CSMA/CD is an example of a provably optimal on-line distributed
scheduling algorithm for case d1 (Table 2), for the computational model indicated.

The complexity of case d1 (as well as case d2) is relatively easily tractable. This is not so for
cases d3 and d4. Some satisfactory solutions have emerged very recently. However, the design and
validation of LDPART systems will not be a mature discipline until more correct, and possibly
provably optimal, solutions become available for case d4.

Also, the concept of competitive ratio should be refined, so as to yield more realistic bounds than
those established assuming an all-knowing adversary, which is overly pessimistic in many real
world settings.

Let  us  conclude  this  section  by  briefly  showing  what  can  be  achieved  under  our  approach.  Let  us 
use  the  DOD-CSMA/CD  example  again.  Without  any  a  priori  knowledge,  it  is  possible  to  prove 
that  under  worst-case  conditions  (a  global  collision  among  n  contenders),  it  takes  at  most  T  to 
transmit  all  messages  successfully,  with 


$$T = \left[\,2\log_2 F - 1\,\right] s + \sigma_n ,$$

where F is the number of equivalence classes chosen for deadlines (i.e. comparable deadlines), s
is the conventional CSMA/CD channel slot time and $\sigma_n$ represents the total message (n of them)
transmission time (without collisions).

It  is  possible  to  obtain  bounds  also  for  the  algorithms  we  have  selected  to  solve  the  (more 
difficult)  combined  problem  of  distributed  Scheduling  and  multiple-object  Concurrency  Control. 

Validation of a design is then very easy. We use the parameters given to characterize a particular
LDPART system (e.g. arrival laws, individual task/message durations). From the analytic
expressions of bounds, we compute the corresponding numerical values and check whether they
match the specified system-wide requirements/objectives. If this is not the case, a different design
solution must be considered or additional a priori knowledge must be provided (although this will
reduce the coverage factor of the corresponding solutions).

3.4.  Fault-tolerance 

With  LDPART  systems,  many  abnormal  behaviors  (called  failures)  could  result  from 
multiplexing  together  an  a  priori  unknown  number  of  application  S/W  modules  (transactions) 
over  an  a  priori  unknown  number  of  computational  resources.  Given  our  proof-based  approach 
and  our  computational  model,  such  failures  cannot  exist,  if  it  can  be  assumed  that : 

(i)  every  individual  transaction  is  correctly  implemented, 

(ii)  every  algorithm  is  correctly  implemented. 

We  are  thus  restricting  the  scope  of  fault-tolerance  in  LDPART  systems  to  the  familiar  area  of 
conventional  S/W  engineering  and  need  only  draw  solutions  from  current  state-of-the-art  in  Fault 
Tolerance. 

We  will  not  discuss  here  the  relative  merits  and  drawbacks  of  S/W  engineering  solutions  directed 
at  producing  dependable  S/W.  Whatever  solution  is  used,  operational  S/W  still  contains  «bugs». 
The  best  (non  speculative)  solutions  to  this  problem  today  are  those  used  in  Tandem  systems 
(process  pairs,  primary/backup,  checkpoints)  and  in  Stratus  systems  (process  pairs,  dual  duplex 
processing). Although process replication does not solve the «solid bug» problem (S/W design
faults), experience indicates (statistics published by Tandem Corp. in particular) that process
replication solves most H/W failure related problems as well as the «transient bug» problem,
which  seems  to  largely  dominate  the  «solid  bug»  one. 

Replication  of  process  execution  at  run-time  and  replication  of  H/W  as  well  as  of  system  S/W 
(OS)  need  to  be  hidden.  It  is  no  coincidence  that  the  Transactional/Object  Orientation  Model  has 
proved  itself  as  being  well  suited  to  provide  application  S/W  developers  with  a  programming 
interface  that  fully  hides  the  intricacies  of  the  solutions  used  to  achieve  a  given  degree  of  system- 
wide  fault-tolerance.  This  is  not  surprising  indeed,  as  the  same  design  Model  has  proved  itself  to 
be  well  suited  also  to  hide  the  intricacies  of  the  Distributed  Concurrency  Control  and  Scheduling 
solutions,  as  indicated  above. 

Regarding  fault-tolerance,  from  a  more  general  (and  maybe  theoretical)  viewpoint,  one  must  be 
aware of the fact that the «Real-Time community» implicitly relies on what is inappropriately
called a «synchronous» computational model (i.e. a model where upper bounds on computation/
communication delays are known a priori). Fault-tolerance under such a model might raise the
need to perform on-line assumption verification (that the bounds are not violated). This is a
delicate issue. We recommend that special attention be paid to this issue. Especially when systems
are  «large»,  assumptions  on  a  priori  knowledge  of  «good»  bounds  might  have  a  bad  coverage 
factor.  Hence,  it  might  be  necessary  to  use  a  «partially  synchronous»  computational  model  (i.e. 
bounds  exist  but  are  not  known  a  priori)  which  still  permits  deterministic  solutions,  or  an 
«asynchronous»  computational  model  (i.e.  bounds  do  not  exist),  where  only  probabilistic  or 
randomized  solutions  can  be  contemplated. 


A  last  point  might  be  worth  addressing,  that  is  testing/debugging  of  LDPART  systems.  As  we  still 
do  not  know  how  to  produce  fault-free  S/W,  we  still  have  to  rely  on  testing/debugging.  It  is  often 
heard  that  this  task  is  rendered  more  difficult  in  the  case  of  distributed  architectures.  We  do  not 
understand  this  statement.  We  suspect  that  it  might  result  from  a  lack  of  understanding  of  what  is 
needed for a real-time distributed system to run correctly. We have shown that CC algorithms are
needed,  in  particular  to  enforce  particular  orderings  or  histories  on  sets  of  concurrent  events. 
Typically,  CC  algorithms  use  attributes  representing  some  notion  of  logical  time,  such  as 
timestamps  or  tickets.  These  attributes  reflect  causality  relationships  among  events.  Similarly, 
global  physical  time  must  be  maintained  in  a  LDPART  system.  Physical  timestamps  reflect 
chronological  relationships  among  events. 

It  is  therefore  possible  to  record  separately  histories  or  traces  as  produced  by  individual 
components  of  a  LDPART  system  and  merge  them  consistently  by  using  the  logical/physical 
attributes  associated  to  events.  This  yields  the  possibility  of  observing  system-wide  causally  and 
chronologically  correct  histories,  as  is  the  case  with  conventional  (centralized)  systems. 
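
As a rough illustration of this merging step, the sketch below combines per-component traces into one system-wide history. The `Event` record, its field names, and the use of a Lamport-style integer as the logical attribute are assumptions made for the example; the text does not prescribe a trace format. Events are ordered by logical timestamp first, so that causality is respected, and ties are broken by physical timestamp.

```python
import heapq
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    logical: int     # logical timestamp (assumed: Lamport-style counter)
    physical: float  # reading of the global physical clock, in seconds
    node: str        # component that recorded the event
    label: str       # free-form description of the event


def merge_traces(traces: Iterable[list]) -> list:
    """Merge per-component traces into one history.

    Each input trace is assumed to be sorted by logical timestamp already,
    which holds when every component stamps its own events in increasing order.
    """
    # Logical time comes first to preserve causality; physical time and the
    # node name only break ties so that the resulting order is total.
    return list(heapq.merge(*traces, key=lambda e: (e.logical, e.physical, e.node)))


# Hypothetical traces recorded separately on two components.
trace_a = [Event(1, 0.10, "A", "send m"), Event(4, 0.40, "A", "recv ack")]
trace_b = [Event(2, 0.25, "B", "recv m"), Event(3, 0.30, "B", "send ack")]
history = merge_traces([trace_a, trace_b])
```

In this sketch the merged history lists the send of `m` before its receipt, and the send of the acknowledgement before its receipt, even though each trace was recorded in isolation.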

4.  CONCLUSION 


We  have  developed  a  number  of  arguments  which,  we  hope,  should  help  our  community  to  save 
time and invest resources where appropriate. LDPART systems are raising issues which cannot be
correctly tackled without knowledge of current state-of-the-art in Computer Science. Even though
very simple designs have been satisfactory in the past, that will be less and less often the case. The
precambrian era is over. Given the continuing advances made in H/W, we feel it mandatory to
adopt an open-minded approach to the challenging issues raised with LDPART systems, so as to
avoid facing somewhat embarrassing conclusions such as not allowing oneself to use off-the-shelf
technology (e.g. cache-memory based computers) because they do not fit an excessively poor/
naive  computational  model. 
INTRODUCTION

In recent years, significant progress has been made in the design and evaluation of scheduling
algorithms  that  are  basic  building  blocks  of  real-time  systems  containing  one  or  a  few  processors  and 
supporting  traditional  embedded  applications.  In  particular,  the  rate-monotonic  approach  [1-4]  to 
scheduling  real-time  computations  and  data  communications  is  well  developed.  Algorithms  for  assigning 
periodic  tasks  to  processors,  processor  scheduling,  I/O  bus  and  broadcast  network  access,  synchronization 
between  periodic  tasks,  and  interrupt  handling  are  now  available  (e.g.,  [5-8]).  There  is  a  growing 
collection  of  rigorous  performance  bounds  and  simulation/measurement  data  to  support  design  choices 
and  decisions.  Systematic  design  and  synthesis  methods  and  sound  validation  and  testing  strategies  are 
beginning  to  emerge. 

In  contrast,  the  basic  methodologies  needed  to  build  and  validate  distributed  real-time  systems  are 
not  yet  available.  This  is  especially  true  when  the  applications  are  complex  and  dynamic  and  have 
critical timing constraints. Many problems on scheduling and resource access control of distributed
real-time applications remain to be solved. Systematic design and synthesis methods and tools need to be
built on not only the solutions to these problems but also rigorously established boundaries governing the
correct and safe usage of the solutions. Examples of key problems include how to schedule tasks to meet
their end-to-end timing constraints in distributed and parallel environments; how to make scheduling and
resource access control decisions when information needed to support such decisions is old, partial or
unavailable;  how  to  predict  and  prevent  oscillatory  behavior  and  instability  of  the  resultant  systems  built 
on  dynamic  scheduling  strategies;  and  how  to  ensure  graceful  degradation  in  the  presence  of  overloads 
and  failures. 

This  paper  first  describes  a  view  of  distributed  and  parallel  real-time  systems  and  suggests  a 
hierarchical  approach  to  scheduling  and  resource  management  and  to  reasoning  about  the  timing  behavior 
of  large  distributed  systems.  It  then  discusses  the  key  problems  mentioned  above  and  concludes  by 
giving  additional  arguments  on  why  the  design  of  real-time  systems  should  have  sound  theoretical 
foundations  in  real-time  scheduling. 

MODELS  OF  LARGE,  DISTRIBUTED,  AND  PARALLEL  SYSTEMS 

By  a  large  system,  we  mean  one  that  contains  hundreds  or  thousands  of  tasks.  Here,  the  terms  tasks 
and  subtasks  loosely  refer  to  individual  units  of  work  that  are  allocated  resources  and  executed  to  support 
the  functionalities  of  the  system.  For  example,  a  task  or  a  subtask  may  be  a  granule  of  computation,  a 
unit  of  data  transmission,  a  response  to  a  query,  or  even  an  operator  action.  It  executes  on  a  computer,  a 
data  link,  a  database,  or  an  operator  console.  We  model  all  resources  on  which  tasks  execute  abstractly  as 
processors. 

A system may contain many different types of processors. Processors of different types cannot be
used interchangeably either because they are functionally different (e.g., signal processors, data
processors, and communication links are functionally different processors), or because they are located at
different places (e.g., the data links connected to onboard sensors are distinct from the links to a decision
support system on the ground). On the other hand, processors of the same type are considered to be
identical or compatible because they can be used interchangeably.

Concurrency 

In  a  complex  distributed  system,  a  task  typically  consists  of  subtasks  that  are  dependent  on  each 
other  and  execute  in  turn.  For  example,  in  an  air  traffic  control  system,  the  sequence  of  operations  that  is 
carried  out  when  an  aircraft  first  enters  a  coverage  area  is  such  a  task.  It  consists  of  a  sequence  of 
subtasks  which  model  the  processing  of  the  radar  sensor  data  on  a  signal  processor  to  generate  a  track 
record,  the  transmission  of  the  track  record  to  a  data  processor,  the  correlation  of  the  track  record  with 
other  records  on  the  data  processor,  the  transmission  of  the  aircraft  characteristics  from  the  data  processor 
to an intelligent decision support system, the analysis by the decision support system, and so on, until the
correct action is taken. The processors on which the subtasks execute are different because they support
different functions. On the other hand, the processors that handle the displays in an airport control tower
and an enroute control center may be functionally identical. We nevertheless consider them to be of
different types because we do not want to execute the subtask that updates one display on the processor
directly connected to the other display under normal operating conditions. Similarly, in a command,
control and communication system, the task of sending a message can be decomposed into subtasks that
route and transmit the message from the sender to the receiver(s) in a large network of data links and
switches. The switch and link connected to the sender are different from the switch(es) and link(s)
connected to the receiver(s) because they cannot be used interchangeably.

From  the  examples  above,  we  see  that  concurrency  in  a  distributed  system  arises  naturally  as 
subtasks  of  different  tasks  sequencing  through  different  processors  in  a  pipeline  fashion.  This  type  of 
concurrency  can  be  captured  to  a  great  extent  by  the  classical  job  shop  and  flow  shop  models  and  their 
variations  [9-12].  The  former  models  a  system  in  which  the  subtasks  in  different  tasks  execute  on 
different processors in arbitrary orders, while the latter models a system in which the subtasks in different
tasks  execute  on  different  processors  in  the  same  order. 

A variation of the classical model is the constrained job shop; this model characterizes a distributed
system as a set of heterogeneous flow shops that share processors. In other words, a system contains
many classes of tasks: tasks in the same class execute on different processors in the same order, but tasks
in different classes execute on different processors in different orders. The corresponding queuing
theoretical  model  is  a  network  of  queues  with  many  job  classes  and  multiple  routing  chains.  Each  chain 
gives  the  order  in  which  tasks  in  a  class  execute  on  different  processors.  An  example  where  this  model  is 
appropriate  is  a  multihop,  real-time  network.  The  individual  flow  shops  model  virtual-circuit  connections 
established  in  the  network.  Each  connection  carries  multiple  streams  of  real-time  data  that  must  be 
delivered  from  one  end  of  the  connection  to  the  other  end  in  time. 
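
The relation between these models can be sketched in a few lines. In the example below, each task is described only by its route, that is, the ordered list of processors on which its subtasks execute. Tasks with identical routes form one class, and each class is one flow shop of the constrained job shop. The function names and the link names are hypothetical and serve only to illustrate the definitions.

```python
from collections import defaultdict


def flow_shop_classes(routes):
    """Group tasks whose subtasks visit the same processors in the same order.

    `routes` maps a task name to the ordered list of processors on which its
    subtasks execute. Each returned group is one class of the constrained job
    shop, i.e. one flow shop.
    """
    classes = defaultdict(list)
    for task, route in routes.items():
        classes[tuple(route)].append(task)
    return dict(classes)


def is_flow_shop(routes):
    """True when all tasks follow one common processor order."""
    return len(flow_shop_classes(routes)) <= 1


# Hypothetical multihop network: two routing chains that share link_1 and link_2.
routes = {
    "stream_a": ["link_1", "link_2", "link_3"],
    "stream_b": ["link_1", "link_2", "link_3"],
    "stream_c": ["link_4", "link_2", "link_1"],
}
classes = flow_shop_classes(routes)
```

Here `stream_a` and `stream_b` belong to one class (one virtual-circuit connection), while `stream_c` forms a second class that shares processors with the first.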

Other  variations  of  the  classical  flow  shop  and  job  shop  models  are  flow  shop  with  recurrence  and 
periodic  flow  shop  and  job  shop  [12].  In  a  flow  shop  with  recurrence,  each  task  executes  more  than  once 
on one or more processors. This variation models a system that does not have a dedicated processor for
every function. An example is a control system containing three computers: a sensor data processor, a
computation server, and an actuator command generator. The computers are connected by a token ring.
We can model the token ring as a processor and the system as a flow shop with recurrence. Each task
executes first on the sensor data processor, then on the ring, on the computation server, on the ring again,
and finally on the command generator. In a periodic flow shop or job shop, each task, and hence each
subtask, is a periodic sequence of requests for the same computation or communication. A multi-hop
connection that is used to transmit periodic multimedia data and a distributed air traffic control system
that periodically processes radar returns, tracks aircraft and displays their flight paths can be modeled as
periodic flow shops or job shops.
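
The token-ring example can be written down as a route in which one processor appears more than once. The sketch below, with hypothetical processor names, simply reports which processors a task revisits; a non-empty result is what distinguishes a flow shop with recurrence from a classical flow shop.

```python
from collections import Counter


def recurrent_processors(route):
    """Return the processors that a task visits more than once, with visit counts."""
    counts = Counter(route)
    return {processor: n for processor, n in counts.items() if n > 1}


# Route of every task in the three-computer control system described above.
control_route = [
    "sensor_data_processor",
    "token_ring",
    "computation_server",
    "token_ring",
    "command_generator",
]
revisited = recurrent_processors(control_route)
```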

Parallelism 

In  addition  to  concurrent  executions  of  subtasks  on  different  types  of  processors,  parallel  executions 
are feasible whenever there is more than one processor of the same type. In an air traffic control system, for
example, there may be an array of signal processors, making it possible to execute many signal
processing subtasks of different tasks in parallel. Similarly, multiple links and switches between the
sender  and  the  receiver(s)  provide  parallel  paths  that  can  be  used  to  increase  throughput  or  reduce 
message  delay.  The  traditional  multiprocessor  and  redundant-processor  models  (e.g.  [13-18])  used  in 
studies  on  parallel  and  distributed  scheduling,  task  assignment  and  load  balancing  capture  this  type  of 
parallelism.  Such  a  model  characterizes  a  subsystem  abstractly  as  a  set  of  processors  that  are  either 
identical  or  compatible;  a  subtask  can  execute  on  any  of  them.  Each  subtask  may  be  further  divided  into 
subtasks of smaller granularity so that the degree of parallelism can be increased, with possible
accompanying increases in communication and scheduling overheads.

Some  processors  may  in  fact  be  massively  parallel  machines.  Examples  of  such  processors  include 
some  parallel  systems  designed  to  do  image  enhancement,  feature  extraction  and  object  identification. 
Fine-grain parallelism can be exploited to speed up the completion of subtasks only when we have good
parallel  computation  algorithms  and  parallelizing  compilers.  The  issue  here  is  not  about  real-time 
scheduling. 

REAL-TIME  CONSTRAINTS 

By a real-time system here, we mean specifically a computing and communication system in which
a significant number of tasks have critical timing constraints. Timing constraints of a task can almost
always  be  expressed  in  terms  of  its  (absolute  or  relative)  release  time  and  deadline.  The  former  is  the 
time  instant  after  which  it  can  begin  execution.  The  latter  is  the  time  instant  by  which  it  must  be 
completed.  A  timing  fault  is  said  to  occur  when  one  or  more  timing  constraints  are  violated.  A  real-time 
system  operates  correctly  only  in  the  absence  of  timing  faults. 
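
These definitions translate directly into a check. The following minimal sketch assumes that a task's actual start and finish instants are available from a schedule or a trace; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingConstraint:
    release: float   # instant at or after which execution may begin
    deadline: float  # instant by which execution must be completed


def timing_fault(constraint, start, finish):
    """True when the observed execution violates the release time or the deadline."""
    return start < constraint.release or finish > constraint.deadline


constraint = TimingConstraint(release=2.0, deadline=10.0)
on_time = timing_fault(constraint, start=2.0, finish=9.5)    # no fault
too_early = timing_fault(constraint, start=1.0, finish=9.0)  # starts before release
too_late = timing_fault(constraint, start=3.0, finish=10.5)  # misses the deadline
```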

Timing  constraints  that  follow  naturally  from  high-level  requirements  of  distributed  real-time 
applications  are  typically  end-to-end  in  nature.  For  example,  in  a  collision  detection  and  avoidance 
system,  the  maximum  allowed  elapse  of  time  from  the  instant  when  a  target  is  detected  to  the  instant 
when  an  evasive  action  must  be  taken  is  determined  by  how  soon  a  collision  can  occur.  The  requirement 
that  a  correct  evasive  action  is  taken  in  time  imposes  an  overall  deadline  on  the  task  consisting  of  a 
sequence  of  computation  and  communication  subtasks.  (These  subtasks  process  radar  returns,  identify 
the  target,  compute  its  course,  choose  the  evasive  action,  and  generate  and  send  the  commands  to  the 
actuators.)  As  long  as  the  sequence  is  completed  by  the  deadline,  it  is  not  important  when  the 
intermediate  computational  subtasks  are  done  or  how  long  messages  are  delayed. 

In  other  words,  to  meet  end-to-end  timing  constraints  of  a  task,  we  are  required  to  begin  executing 
its first subtask at or after its release time and complete the execution of its last subtask by its deadline.
The intermediate subtasks have only derived release times and deadlines; their executions are constrained
only by the dependencies between them and by the fact that they must be completed sufficiently early to
allow the on-time completion of the last subtask. Their lack of application-imposed timing constraints
provides the system with more freedom in scheduling the intermediate subtasks. This freedom, together
with our desire to take advantage of it to achieve greater efficiency in resource usage, increases the
complexity of scheduling and resource access control in distributed and parallel environments.
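
To make the notion of derived release times and deadlines concrete, the sketch below computes, for a chain of subtasks, the earliest instant at which each subtask can start and the latest instant by which it must finish. It assumes a strictly sequential, non-empty chain, known execution times, and no contention from other tasks; these simplifications are ours and serve only to show how much scheduling freedom the intermediate subtasks have.

```python
from itertools import accumulate


def derived_windows(release, deadline, exec_times):
    """Return (earliest release, latest deadline) for each subtask in a chain.

    `exec_times` lists the execution times of the subtasks in precedence order
    and must not be empty.
    """
    finished = list(accumulate(exec_times))  # work done up to and including subtask i
    total = finished[-1]
    windows = []
    for e, done in zip(exec_times, finished):
        earliest_release = release + (done - e)      # all predecessors run first
        latest_deadline = deadline - (total - done)  # all successors must still fit
        windows.append((earliest_release, latest_deadline))
    return windows


# End-to-end release time 0, end-to-end deadline 10, three subtasks.
windows = derived_windows(0.0, 10.0, [2.0, 1.0, 3.0])
```

Each window is longer than the execution time of its subtask whenever the end-to-end interval exceeds the total execution time; that slack is the freedom referred to above.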

HIERARCHICAL  APPROACH 

The  characteristics  of  concurrency  and  parallelism  in  task  executions  and  the  end-to-end  nature  of 
their  timing  requirements  suggest  a  hierarchical  approach  to  scheduling  in  distributed  environments.  The 
primary  goal  of  scheduling  and  resource  access  control  in  a  distributed  real-time  system  should  be  to 
enforce  the  end-to-end  timing  constraints  that  directly  follow  from  high-level  timing  requirements  of  the 
applications  supported  by  the  system.  In  end-to-end  scheduling  of  time-critical  tasks,  we  want  to  assign 
intermediate  release  times  and  deadlines  to  their  subtasks  and  to  schedule  the  subtasks  on  the  individual 
processors  so  as  to  ensure  the  completion  of  all  tasks  in  time  whenever  it  is  feasible  to  do  so.  We  want 
the  resultant  system  to  be  responsive  to  varying  demands,  to  degrade  gracefully  in  the  presence  of 
overloads and failures, and to be easy-to-modify, maintain, validate and test.

Scheduling subtasks on interchangeable processors has a set of secondary goals, including to make
good use of parallelism, to maximize the likelihood of on-time completion of subtasks, to equalize
resource  utilizations,  to  provide  redundancy,  to  increase  availability,  etc.  The  traditional  focus  of 
research  on  distributed  and  parallel  systems  has  been  on  ways  to  meet  these  secondary  goals.  Past  works 
on  parallel  and  distributed  scheduling  have  produced  many  excellent  task  assignment  and  load  balancing 
schemes,  as  well  as  performance  bounds  of  multiprocessor  scheduling  algorithms;  examples  of  these 
results  can  be  found  in  [13-19].  These  schemes  can  enhance  end-to-end  scheduling  algorithms  in  order  to 
improve the overall robustness, efficiency, and availability of a distributed system but, by themselves, are
not  solutions  to  the  end-to-end  scheduling  problem. 

To  illustrate  the  relation  between  the  end-to-end  scheduling  problem  and  the  traditional  distributed 
scheduling  problems,  we  note  that  one  way  to  do  end-to-end  scheduling  is  to  assign  intermediate  release 
times and deadlines to all subtasks. After the intermediate release times and deadlines are assigned, we
can then use some task assignment, multiprocessor scheduling, load balancing and task migration
schemes to schedule the subtasks on processors of each type, trying to make the best use of parallel
resources. Therefore, an end-to-end scheduler can be viewed as a high-level system module that contains
multiprocessor  schedulers,  load  balancers  and  resource  managers  as  components. 
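
One simple way to carry out the first step is to divide the end-to-end interval among the subtasks in proportion to their execution times. This proportional rule is an assumption chosen for illustration, not a method taken from the works cited here. Each resulting (release time, deadline) pair can then be handed to the scheduler of the corresponding processor type.

```python
def proportional_windows(release, deadline, exec_times):
    """Split [release, deadline] among chained subtasks in proportion to execution time.

    Returns one (intermediate release time, intermediate deadline) pair per
    subtask; the deadline of a subtask is the release time of its successor.
    """
    total = sum(exec_times)
    interval = deadline - release
    windows = []
    start = release
    elapsed = 0.0
    for e in exec_times:
        elapsed += e
        end = release + interval * elapsed / total
        windows.append((start, end))
        start = end
    return windows


# End-to-end interval [0, 12] shared by subtasks with execution times 2, 1 and 3.
assigned = proportional_windows(0.0, 12.0, [2.0, 1.0, 3.0])
```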

END-TO-END  SCHEDULING  ISSUES 

We  now  examine  several  issues  in  end-to-end  scheduling  that  remain  to  be  addressed.  They  are 
concerned  with  increasingly  more  complex  and  dynamic  situations,  listed  here  according  to  the  amount  of 
load  and  status  information  available  to  support  scheduling  decisions  and  the  costs  of  maintaining  this 
information. 

Scheduling  with  Global  Information 

Almost  all  available  end-to-end  scheduling  algorithms  that  are  supported  by  rigorous  performance 
bounds and profiles are suited only for systems and subsystems that are either sufficiently small and
tightly-coupled or sufficiently static. Examples of the former include flight control and radar signal
processing systems, which are really multiprocessor systems. Examples of the latter include industrial
process control, multihop virtual-circuit networks, and flight management systems under their normal
operating conditions. In these systems, it is feasible to collect and distribute global load and status
information and keep the information sufficiently current. Consequently, it is reasonable to assume that
the scheduler for each processor, or each type of processors, knows the timing and resource requirements
of all the tasks in the system as well as the decisions made by other schedulers. Therefore, it is feasible
for the schedulers to work closely together and arrive at compatible decisions, or even do scheduling
centrally.

Again, a reasonably realistic workload model for systems on which global status information is
available is the constrained job shop model. Rather than scheduling all the tasks from all classes together
according to one algorithm, a practical strategy is to partition the time available on processors of each
type and allocate the partitions to task classes. The processor time allocations are adjusted infrequently
(for example, during mode changes or new connection establishments). This allows the tasks in each
class to be scheduled according to a scheduling algorithm suited for the class. A system that contains N
classes of tasks and uses such a semi-static partition and allocation strategy can be modeled as a system of
N flow shops. Many known algorithms designed for scheduling in flow shops and job shops to meet
end-to-end deadlines or to minimize lateness and algorithms for scheduling jobs in factories (e.g. [12,19,20])
are likely to be applicable. These algorithms should be studied and thoroughly evaluated.

Algorithms  for  assigning  intermediate  release  times  and  deadlines  to  subtasks  in  periodic  flow  shops 
and  job  shops  are  also  emerging.  One  way  assumes  the  use  of  a  preemptive  static-priority-driven 
algorithm (e.g. the rate-monotonic or deadline-monotonic algorithm) to schedule subtasks on each type
of  processors.  In  this  case,  the  rate-monotonic  (or  deadline-monotonic)  technique  can  be  extended 
straightforwardly  to  deal  with  end-to-end  scheduling.  Another  example  of  periodic  job  shop  scheduling 
algorithms  uses  a  novel  convex  programming  method  and  an  iterative  modification  method  to  divide  the 
feasible  interval  between  the  release  time  and  deadline  of  every  task  into  segments  in  order  to  assign 
intermediate deadlines, to check whether schedulability conditions on all processors are met, and to
modify  the  intermediate  deadlines  if  necessary. 
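
For the static-priority case, the per-processor schedulability check can be as simple as a utilization test. The sketch below applies the classical sufficient bound for rate-monotonic scheduling, n(2^(1/n) - 1), to the periodic subtasks on each processor. It assumes one processor per type, independent subtasks, and deadlines equal to periods; none of these assumptions comes from the text, and a set that fails the test is not necessarily unschedulable.

```python
def rm_utilization_bound(n):
    """Sufficient utilization bound for n periodic tasks under rate-monotonic scheduling."""
    return n * (2 ** (1 / n) - 1)


def rm_schedulable(subtasks):
    """Sufficient test for a list of (execution_time, period) pairs on one processor."""
    if not subtasks:
        return True
    utilization = sum(e / p for e, p in subtasks)
    return utilization <= rm_utilization_bound(len(subtasks))


def check_all(processors):
    """Apply the test to every processor; `processors` maps a name to its subtasks."""
    return {name: rm_schedulable(subtasks) for name, subtasks in processors.items()}


# Hypothetical periodic flow shop with two processors.
report = check_all({
    "signal_processor": [(1.0, 4.0), (2.0, 8.0)],  # utilization 0.5
    "data_link": [(3.0, 4.0), (2.0, 8.0)],         # utilization 1.0
})
```

When a processor fails the test, the intermediate deadlines can be revised and the check repeated, in the spirit of the iterative modification described above.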

How  to  integrate  task  assignment,  load  balancing  and  concurrency  control  functions  into  end-to-end 
scheduling  is  a  problem  that  has  not  yet  received  much  attention.  Sound  strategies  need  to  be  developed. 
Strategies  and  algorithms  assuming  current  global  information  can  serve  as  benchmarks  against  which  we 
can  measure  the  effectiveness  of  algorithms  that  do  not  rely  on  the  global  status  information. 

Scheduling  Based  on  Local  and  Global  Information 

In  large  and/or  dynamic  systems  the  information  about  the  global  system  state  is  not  always 
available and is usually too costly to keep current. This case of the end-to-end scheduling problem
resembles the problems in routing and flow control in packet-switched networks. The major difference
between the problems is in the primary objectives of scheduling: we want to make sure that all
time-critical tasks will meet their end-to-end deadlines while the objectives of routing and flow control are to
keep  the  average  end-to-end  response  time  low  and  the  overall  throughput  high,  and  to  ensure  fairness 
among  tasks.  Nevertheless,  there  are  good  lessons  to  be  learned  from  strategies  in  routing  and  flow 
control  for  coping  with  missing  or  old  global  information. 

Using Periodically Updated Global Information — For example, the strategy used in a version of
the ARPANET routing algorithm is to let each switch make its routing decisions based on its own
information on the global load condition. This information is updated periodically. The rationale behind
this strategy is that the information will remain sufficiently current throughout the update period and the
routes chosen based on this information will be sufficiently good. A similar strategy for end-to-end
scheduling is feasible when the system consists mostly of periodic tasks and sporadic tasks with bounded
interarrival rates. The number of subtasks on each processor (type) and their total utilization can be made
available periodically to all schedulers in the system, for instance. The scheduler for each processor can
use this information to estimate when new subtasks are likely to arrive. This makes it possible for the
scheduler  to  do  some  pre-planning,  to  increase  the  schedulability  and  reduce  the  completion  times  of  its 
subtasks.  Design  parameters  of  this  strategy  are  the  amount  of  information  exchanged  by  the  schedulers 
and  the  length  of  the  update  periods.  How  to  choose  these  parameters  based  on  the  time  parameters  and 
dynamics  of  the  task  system  is  a  question  to  be  answered. 

Using  Local  Information  Only  —  Rather  than  scheduling  the  executions  of  all  the  subtasks  in  each 
task  in  a  coordinated  manner  as  discussed  earlier,  an  opposite  approach  is  to  have  the  scheduler  on  each 
processor  decide  when  to  execute  what  subtasks  based  on  the  information  it  has  about  the  parameters  of 
its  own  subtasks  alone.  No  information  is  exchanged  between  schedulers.  In  this  case,  a  scheduler 
cannot predict the arrivals of new subtasks and their parameters. Moreover, the execution times of later
subtasks  may  be  unknown;  the  given  end-to-end  deadlines  do  not  provide  much  information  on  when  the 
intermediate  subtasks  must  be  completed.  There  are  many  reasons  to  believe  that  we  will  have  poor 
performance  when  scheduling  decisions  are  based  solely  on  local  information.  Under  the  assumptions 
stated  above,  there  is  no  choice  but  to  schedule  the  subtasks  on  each  processor  completely  on-line.  It  is 
known  that  the  performance  of  completely  on-line  scheduling  is  poor  [21].  We  can  also  view  an  end-to- 
end  schedule  produced  independently  by  the  individual  schedulers  as  a  priority-driven  schedule  based  on 
decisions  that  are  at  best  locally  optimal.  It  is  known  that  priority-driven  scheduling  strategies  have 
unacceptably  poor  worst-case  performance  in  systems  that  contain  functionally  dedicated  processors  [16]. 
Despite all the preliminary evidence against it, the local scheduling approach should be thoroughly
evaluated for two reasons. First, this approach is the only feasible one in extreme situations when no
prior knowledge about task parameters is available, and even when any task will be released and ready for
execution is unknown. Second, its performance data will give us a set of benchmarks on one end of the
performance spectrum, with the data on algorithms for scheduling with complete knowledge on the other
end. 


Using Both Local and Global Information — The strategies that use both local and global
information and combine local decisions with global decisions are likely to be effective. In particular,
local information that is current can be used to supplement the global information that becomes old. In
this way, we can reduce the update frequency of the global information and improve the responsiveness
of the system. A well-known example of strategies for combining global and local decisions is delta
routing in networks [22]. Analogous strategies for combining global information with local information
to make end-to-end scheduling more responsive and robust while keeping the cost of maintaining status
information low should be explored. Specifically, hybrid scheduling strategies combine off-line
scheduling based on global information with on-line scheduling based on local information. The global
schedule(s) is modified infrequently. Some dynamic conditions have short time constants, making it
impossible or too costly to maintain global information about them. Responding to these conditions is
taken care of by local schedulers. They try to schedule on-line the unexpected tasks as best as they can
based on their local information and the guideline given by the global schedule. These strategies fit well
in the framework of the rate-monotone paradigm and the imprecise computation paradigm [24-27].

Dynamic,  Adaptive  and  Monitor-Based  Scheduling 

Algorithms  known  as  dynamic  algorithms,  adaptive  algorithms  and  monitor-based  algorithms 
assume  that  the  system  condition  is  monitored  and  scheduling  decisions  are  based  on  the  observed 
condition.  (We  will  simply  refer  to  them  all  as  adaptive  algorithms.)  In  some  cases,  decision  rules  may 
also  change  as  the  observed  condition  changes. 


It is well known that adaptiveness can lead to oscillatory behavior and instability. We cannot use an
adaptive algorithm for scheduling functionally critical tasks until we thoroughly understand the dynamic
behavior of the resultant system and have effective methods for predicting oscillatory behavior and
instability and for preventing them. One way to prevent instability and oscillatory behavior is to make
adaptive  schedulers  less  sensitive  to  changes  in  the  system  condition.  Another  way  is  to  avoid 
synchronous  reactive  actions  of  all  the  schedulers.  Unfortunately,  all  the  effective  methods  tend  to  make 
the  system  less  responsive  and  impose  a  limit  on  how  dynamic  a  system  can  be.  It  is  important  for  us  to 
know  this  limitation. 
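
As an illustration of making an adaptive scheduler less sensitive to changes in the observed condition, the sketch below uses hysteresis (two thresholds instead of one). Hysteresis is only one possible damping technique and is our assumption here, not a method prescribed above; the thresholds, the load samples, and the function names are hypothetical.

```python
def make_hysteresis_trigger(high, low):
    """Return a function that reports overload with hysteresis.

    The trigger turns on only when the load rises above `high` and turns
    off only when the load falls below `low`, so fluctuations between the
    two thresholds do not change the decision.
    """
    if low >= high:
        raise ValueError("low must be strictly less than high")
    state = {"overloaded": False}

    def update(load):
        if state["overloaded"]:
            if load < low:
                state["overloaded"] = False
        elif load > high:
            state["overloaded"] = True
        return state["overloaded"]

    return update


def count_switches(decisions):
    """Count how many times a sequence of boolean decisions changes."""
    return sum(1 for a, b in zip(decisions, decisions[1:]) if a != b)


# Hypothetical observed load hovering around a single threshold of 0.8.
loads = [0.78, 0.82, 0.79, 0.83, 0.77, 0.84, 0.60, 0.81]

naive = [load > 0.8 for load in loads]            # single threshold
trigger = make_hysteresis_trigger(high=0.8, low=0.7)
damped = [trigger(load) for load in loads]        # two thresholds

print(count_switches(naive), count_switches(damped))
```

The damped scheduler changes its decision less often, but it also reacts later to the genuine drop in load, which is the loss of responsiveness noted above.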

Enhancing  Graceful  Degradation 

The  imprecise  computation  technique  [24-27]  is  a  natural  way  to  provide  graceful  degradation  and 
to  cope  with  uncertainty  in  system  load.  We  call  a  system  based  on  this  technique  an  imprecise  system. 
In  an  imprecise  system,  each  time-critical  task  is  structured  in  such  a  way  that  it  can  be  logically 
decomposed into two portions: a mandatory portion and an optional portion. The mandatory portion of
the task must be completed to produce an approximate result of an acceptable quality. This portion of a
time-critical task must be completed by the deadline of the task. The optional portion refines the
approximate result produced by the mandatory portion. We can choose to leave this portion unfinished
and terminate it prematurely if necessary, with an accompanying reduction in the result quality.
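
The mandatory/optional split can be sketched with a small iterative computation. The example below is only an illustration under our own assumptions: it approximates a square root by Newton's method, treats the first two iterations as the mandatory portion, and uses an iteration budget as a stand-in for the time remaining before the deadline. The function name and parameters are hypothetical.

```python
def imprecise_sqrt(x, budget, mandatory_steps=2):
    """Approximate sqrt(x) by Newton's method under an iteration budget.

    The first `mandatory_steps` iterations form the mandatory portion and
    always run. The remaining iterations form the optional portion and run
    only as far as `budget` (a stand-in for time before the deadline) allows.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if budget < mandatory_steps:
        raise ValueError("budget does not cover the mandatory portion")

    estimate = x if x >= 1 else 1.0
    for _ in range(mandatory_steps):            # mandatory portion
        estimate = 0.5 * (estimate + x / estimate)
    for _ in range(budget - mandatory_steps):   # optional portion
        estimate = 0.5 * (estimate + x / estimate)
    return estimate


rough = imprecise_sqrt(2.0, budget=2)     # optional portion skipped entirely
refined = imprecise_sqrt(2.0, budget=6)   # optional portion runs four steps
print(rough, refined)
```

With the smaller budget the result is usable but coarse; each additional optional iteration only improves it, so the optional portion can be cut off at any point.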

An  example  of  where  the  imprecise  computation  technique  is  applicable  is  facsimile  transmission. 
When the progressive build-up method is used, the data encoding each still image is divided into four
blocks; each additional block gives a clearer image. The mandatory portion is the transmission of the first
one or two of the four blocks that give a fuzzy but intelligible image. Other blocks can be discarded
when  the  network  is  congested.  Other  examples  include  transmissions  of  compressed  voice  and  video, 
tracking  and  feedback  control.  In  each  case,  we  often  prefer  to  have  a  timely  result  of  a  poorer  quality 
than  a  late  result  of  the  desired  quality. 

The  imprecise  computation  technique  makes  meeting  all  timing  constraints  significantly  easier  for 
the  following  reason.  To  ensure  that  all  deadlines  are  met,  we  only  need  to  ensure  that  all  the  mandatory 
portions  of  all  tasks  are  completed  by  their  deadlines.  This  can  be  done  by  restricting  all  the  mandatory 
portions  to  have  bounded  execution  time  and  resource  requirements  and  scheduling  them  in  a 
conservative and robust way. The leftover system resources can be used to complete as many optional
portions  as  possible.  It  is  not  necessary  to  eliminate  non-determinism  in  the  timing  and  resource 
requirements  of  optional  subtasks,  thus  allowing  greater  freedom  in  their  design  and  implementation.  For 
different  types  of  applications,  the  costs  and  benefits  in  the  tradeoff  between  the  result  quality  and 
execution  time  requirements  are  more  appropriately  measured  by  different  criteria.  Many  scheduling 
algorithms have been developed to trade off according to these criteria, including the ones described in
[24-26].  Among  the  existing  algorithms  for  scheduling  imprecise  computations,  many  can  be  modified 
for  end-to-end  scheduling.  Examples  include  the  class  of  algorithms  for  scheduling  periodic  subtasks  and 
for on-line scheduling. The imprecise computation approach to flow and congestion control is feasible
for  voice,  video  and  other  messages  as  long  as  they  are  encoded  using  a  technique  that  allows  their 
transmissions  to  be  imprecise. 
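
The idea of guaranteeing the mandatory portions first and giving leftover time to optional portions can be sketched as follows. This is a deliberately simple greedy heuristic under our own assumptions (one processor, all tasks ready at time 0, earliest-deadline-first order); it is not one of the algorithms described in [24-26], and the `Task` type and the task values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline: float
    mandatory: float   # bounded execution time of the mandatory portion
    optional: float    # execution time the optional portion would like


def allocate_optional(tasks):
    """Grant optional time in deadline order without endangering any deadline.

    Tasks run in earliest-deadline-first order, each as mandatory portion
    followed by its granted optional time. Raises ValueError if the
    mandatory portions alone cannot meet their deadlines.
    """
    if not tasks:
        return {}
    ordered = sorted(tasks, key=lambda t: t.deadline)

    # slack[j]: deadline of task j minus all mandatory work up to and including j.
    slack, elapsed = [], 0.0
    for task in ordered:
        elapsed += task.mandatory
        slack.append(task.deadline - elapsed)
    if min(slack) < 0:
        raise ValueError("mandatory portions are not schedulable")

    # tightest[i]: smallest slack among task i and all later tasks.
    tightest = slack[:]
    for i in range(len(tightest) - 2, -1, -1):
        tightest[i] = min(tightest[i], tightest[i + 1])

    grants, granted_total = {}, 0.0
    for task, room in zip(ordered, tightest):
        grant = min(task.optional, max(room - granted_total, 0.0))
        grants[task.name] = grant
        granted_total += grant
    return grants


tasks = [
    Task("track", deadline=4.0, mandatory=1.0, optional=2.0),
    Task("video", deadline=6.0, mandatory=2.0, optional=3.0),
    Task("voice", deadline=10.0, mandatory=3.0, optional=1.0),
]
grants = allocate_optional(tasks)
print(grants)
```

Here "video" receives only part of the optional time it asked for, because granting more would push the mandatory portion of "voice" or the completion of "video" itself past a deadline.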

SUMMARY 

The  hierarchical  approach  to  scheduling  and  resource  access  control  points  to  a  hierarchical 
decomposition  approach  to  design,  synthesize  and  test  large  real-time  systems:  A  large  system  is 
decomposed into subsystems. From a scheduling point of view, each subsystem consists of subtasks that
execute on processors of the same type. The timing requirements of the individual subsystems are
derived from the timing requirements of the system as a whole. Ensuring that all timing constraints of
the system are met involves making sure that the timing constraints of each subsystem are met and that
the end-to-end scheduling strategy robustly enforces the overall timing constraints as it integrates the
subsystems together. The hierarchical decomposition approach is in fact old and commonly used. Its
effectiveness  in  dealing  with  large  distributed  real-time  systems  is  being  questioned  primarily  for  the 
following  reason.  The  intermediate  timing  constraints  of  subsystems  are  typically  assigned  in  a  trial- 
and-error  manner,  and  the  end-to-end  timing  constraints  are  often  not  met  when  the  subsystems  designed 
to  meet  their  assigned  timing  constraints  are  scheduled  together.  What  we  need  to  make  this  approach 
effective  are  principles  in  end-to-end  scheduling  and  resource  allocation  that  can  guide  the  decomposition 
and  integration  processes. 

Past  experience  has  led  us  to  believe  that  a  large  distributed  system  with  critical  timing  constraints 
must  be  designed  and  built  on  sound  and  rigorous  scheduling  theories.  Such  a  system  has  an  intractably 
large  number  of  states.  It  will  be  impractical,  or  even  impossible,  to  validate  and  test  the  system  in  an 
enumerative  and  exhaustive  manner.  We  need  non-exhaustive  validation  and  testing  strategies  that  have 
provably  complete  coverage.  A  hierarchical  approach  to  scheduling  and  resource  access  control  should 
lead  to  system  architectures  for  which  such  strategies  are  likely  to  be  feasible. 

Many  distributed  systems  built  to  date  are  known  to  exhibit  nondeterministic,  oscillatory  and 
unstable  behaviors.  Because  it  is  difficult  to  produce  the  conditions  under  which  the  system  behaves  in 
such  an  undesirable  manner  during  testing,  the  fact  that  they  can  occur  is  often  discovered  when  the 
system  has  been  put  in  use  and  failed  unexpectedly.  We  need  scheduling  theory  to  provide  us  with  not 
only algorithms and protocols to map tasks to processors and resources, but also insights and thorough
understanding of the resultant system so that we can predict and eliminate the oscillatory and unstable
conditions. 

In recent years, the processes involved in the development of large, distributed, parallel real-time
systems, responding to the major changes in the availability of high-performance hardware, have
undergone a massive change due to the size and complexity of projects undertaken. Not every system
developed in these environments has been completed successfully; it is clear that the difficulties
associated with developing such systems have increased at least as fast as the increase in
computing power. Thus, this workshop is being conducted in an environment in which many of
the previous “rules of thumb” by which such systems have been conceptualized, procured, managed,
and developed have become obsolete.

There are four questions posed by the Institute for Defense Analyses for consideration at this
workshop.  This  position  paper  considers  each  question  in  turn. 

INTRODUCTION 

Question 1: What is the best method or methodology for designing large, distributed real-time
systems  where  processing  elements  may  have  a  parallel  architecture? 

There are several software design methodologies (e.g., Structured Analysis) which are frequently
recommended and used for real-time system design, but none actually addresses the central requirements
that distinguish a real-time system. The single characteristic of a real-time system which
distinguishes  it  from  other  complex  systems  is  its  requirement  to  provide  bounded  response  times 
(usually  expressed  in  milliseconds  elapsed  between  an  input  and  its  required  external  response) 
for  all  or  part  of  its  input  domain.  This  is  in  sharp  contrast  to  non-real-time  systems  for  which  the 
central  performance  requirement  is  generally  stated  in  terms  of  throughput  (usually  expressed  in 
rates,  such  as  messages  per  second). 

The requirement to provide bounded response time rather than some level of throughput creates
a  fundamental  dichotomy  between  real-time  systems  and  non-real-time  systems  which  cannot  be 
bridged  by  simply  adding  efficiency  requirements  to  a  design  produced  by  any  of  the  currently 
favored  design  methodologies.  The  requirement  of  bounded  response  time  produces  the  need  to 
manage  all  shared  resources,  including  the  processors,  communications,  memory,  devices,  and 
interfaces  in  ways  that  differ  fundamentally  from  systems  whose  performance  is  characterized 
only  by  throughput  requirements.  For  example,  message  queues  for  throughput-driven  systems 
can  generally  use  FIFO  disciplines,  while  message  queues  for  response  time-driven  systems  must 
generally  use  time-driven  queueing  disciplines,  such  as  priority  or  deadlines. 
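
The difference between the two queueing disciplines can be made concrete with a small sketch. The `DeadlineQueue` class, the message names, and the deadline values are hypothetical and chosen only for illustration; a real system would also have to bound the queue and handle messages whose deadlines have already passed.

```python
import heapq
from collections import deque
from itertools import count


class DeadlineQueue:
    """Message queue that always releases the earliest-deadline message first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: equal deadlines leave in arrival order

    def put(self, deadline, message):
        heapq.heappush(self._heap, (deadline, next(self._seq), message))

    def get(self):
        _deadline, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


# (deadline in ms, message) in arrival order; values are illustrative only.
arrivals = [(50, "log"), (5, "alarm"), (20, "telemetry")]

fifo = deque(message for _deadline, message in arrivals)
fifo_order = [fifo.popleft() for _ in range(len(arrivals))]

dq = DeadlineQueue()
for deadline, message in arrivals:
    dq.put(deadline, message)
deadline_order = [dq.get() for _ in range(len(arrivals))]

print(fifo_order)
print(deadline_order)
```

Under FIFO the urgent "alarm" message waits behind "log" even though its deadline is ten times tighter; the deadline-ordered queue serves it first.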


This fundamental difference must strongly affect the design process at the highest level. All basic
system components (operating system resource management, language run-time processes,
device drivers, communications protocols at every level, and application software architecture)
must consistently reflect this fundamental difference. In distributed real-time systems being
designed and built today and in the foreseeable future, most of these components are, and will
increasingly be, available as Commercial Off-The-Shelf (COTS) components. The application
software  architecture  must  be  constructed  to  use  these  components  in  such  a  way  that  its  response 
requirements  can  be  predictably  and  consistently  met. 

REAL-TIME  SOFTWARE  ARCHITECTURE 

The  application  software  architecture  can  be  thought  of  as  the  highest  level  of  application  design. 
As  with  every  level  of  design,  it  is  expressible  as  a  set  of  design  decisions  which  will  drive  all 
other  design  decisions  for  the  application.  A  real-time  application  software  architecture,  whether 
in a distributed, parallel or uniprocessor environment, which does not start with the consideration
of  its  response  time  requirements,  is  highly  likely  to  encounter  serious  performance  problems 
later  in  its  implementation  life  cycle,  most  frequently  during  integration  and  test.  Design  problems 
found  during  integration  and  test  are  among  the  most  expensive  to  fix,  particularly  when  they  are 
traceable  to  the  software  architecture  itself. 

The  software  architecture  defines  just  how  the  hardware  will  be  used  to  satisfy  the  requirements. 
For each processing node, the architecture dictates which requirements will be supported, how
communications will be handled (including contention and queueing disciplines), how concurrency
and consistency will be provided and controlled, and how faults (either hardware, software,
or  both)  will  be  detected,  corrected,  logged,  and  managed. 

For real-time systems, the software architecture must start with the answers to several
requirements-driven questions. Some of the most important of these questions are:

1. Is the design to be event driven or time driven?

2. How many different time constraints are present in the application?

3. Are the time constraints hard or soft? If soft, what is the nature of the actual constraints?
If hard, are they primarily periodic, or aperiodic?

4. Are the inputs bounded in quantity and interarrival time? If not, what are their arrival
characteristics?

5. Are the algorithms to be used characterized by bounded execution time?

The answers to these questions must drive the choice of software architecture, including the number
of tasks (a task is a separately schedulable sequential procedure, without implications of presence
or absence of shared information or communication with other tasks), use of priorities,
communication techniques, and synchronization techniques. Only after these questions are
answered should a design methodology be chosen. From the real-time perspective, the methodology
choice can be arbitrary, but the requirements for bounded and predictable time management
must be observed throughout the process, since the methodologies themselves will ignore that
issue.

It is important to note that the presence of distribution and parallelization will have a significant
effect on the choice of software architecture. At present, there are no completely general,
well-understood architectures that can be easily analyzed to provide predictable response time in a
distributed or parallel environment, in contrast to such techniques as rate-monotonic analysis or
cyclic executives in uniprocessors. These simple uniprocessor techniques can be used for the
individual processors in distributed parallel nodes, but analyzing the resulting system-wide response
requires decomposing the external end-to-end time constraints into smaller individual constraints
on partial computations, which is difficult and results in significantly sub-optimal resource
utilization throughout the architecture.
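
For a single processor, the rate-monotonic check referred to above can be sketched as follows. The sketch uses the well-known Liu and Layland utilization bound, which is a sufficient but not a necessary condition and assumes independent periodic tasks whose deadlines equal their periods. The task names and numbers are hypothetical, and the sketch says nothing about the harder end-to-end decomposition problem.

```python
def rm_priorities(tasks):
    """Order task names from highest to lowest rate-monotonic priority.

    `tasks` maps a task name to (period, worst-case execution time).
    A shorter period means a higher priority.
    """
    return sorted(tasks, key=lambda name: tasks[name][0])


def rm_utilization_test(tasks):
    """Apply the utilization bound n * (2**(1/n) - 1) to a task set.

    Returns (utilization, bound, passes). Passing guarantees schedulability
    under the stated assumptions; failing is inconclusive.
    """
    n = len(tasks)
    if n == 0:
        raise ValueError("task set is empty")
    utilization = sum(wcet / period for period, wcet in tasks.values())
    bound = n * (2 ** (1 / n) - 1)
    return utilization, bound, utilization <= bound


# Hypothetical task set for one processor: name -> (period, execution time).
tasks = {"sensor": (10, 2), "control": (20, 4), "display": (50, 5)}

priorities = rm_priorities(tasks)
utilization, bound, passes = rm_utilization_test(tasks)
print(priorities, round(utilization, 3), round(bound, 3), passes)
```

A task set that fails this test may still be schedulable; an exact response-time analysis would be needed to decide.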

A  key  paradigm  to  render  system  development  cost  manageable  in  today’s  large  systems  is  reuse. 
With  a  few  exceptions,  the  impact  of  software  component  reuse  in  the  real-time  systems  under 
consideration in this workshop has been very low in the software architecture, design and
implementation, but this must change. This required increase in the use of reusable components in these
systems  provides  a  further  incentive  to  the  creation  of  well-understood  software  architectures  for 
large,  distributed,  parallel  real-time  systems,  since  there  can  be  little  utilization  of  reuse  on  the 
required  scale  unless  the  underlying  software  architectures  are  compatible. 

SCHEDULING  THEORY  VERSUS  DESIGN  THEORY 

Question  2:  What  should  be  the  relationship  between  real-time  Design  Theory  and  real-time 
Scheduling  Theory  in  a  design  methodology  for  this  class  of  systems? 

The  fact  that  scheduling  theory  assumes  knowledge  of  periodicity  and  execution  times  is  not 
indicative  of  an  implied  assumption  that  the  system  has  “already  been  designed”,  but  is  merely  a 
reflection  of  the  fundamental  nature  of  real-time  systems,  in  the  same  sense  as  the  assumption  in  a 
missile design that the mass of the missile is known. In either case, even though the eventual
design will certainly require changes and re-analysis throughout design and implementation, there
is no substitute for early estimation and tracking of these parameters. There can be no assurance
of the ability of a design to meet its timing constraints, either before or after the design is
completed, regardless of the amount of testing, without this information. Note that the same is true of
the cost of the system; obviously the cost cannot be precisely determined before the design and
implementation are completed, but it will certainly be estimated as accurately as possible before
any substantive technical effort is expended, and tracked (and updated) throughout the design and
implementation  of  the  system. 

Additionally,  the  architecture  that  results  from  the  answers  to  the  architectural  questions  above 
will  have  a  critical  impact  on  the  ability  of  the  resulting  design  to  meet  its  time  requirements. 
Unlike many other design attributes, the ability to meet timing constraints is principally
determined when the top-level architecture is defined, rather than in the myriad details that make up the
remainder of the design. This is true because of the tight coupling between the scheduling design
(i.e., the sequencing of all system resources to meet time constraints) and the successful
performance of the system.

Thus, for the class of large, distributed, parallel real-time systems that are the subject of this
workshop, it is critical that the Design Theory be heavily influenced by the Scheduling Theory. The
present immaturity of both for such systems is strongly apparent, and is particularly evidenced by
the fact that questions such as this are even being asked. The failure of a number of such systems
that  have  been  built  over  the  last  several  years  to  meet  their  time  constraints  provides  abundant 
evidence  that  it  is  the  failure  to  create  a  real-time  software  architecture  at  the  beginning  of  the 
design  process  that  has  led  to  designs  incapable  or  barely  capable  of  meeting  time  constraints. 

As  an  example,  consider  the  pipeline  architecture  which  is  frequently  used  for  such  systems.  In 
this architecture, processing each input consists of a sequence of tasks, interconnected using any
of several message passing techniques. Such an architecture seems intuitively simple due to the
presence of multiple nodes in the distributed environment, and leads to significant amounts of
concurrency and resulting efficiency. However, such designs do not meet the basic requirements
of any of the known scheduling theories, all of which involve ensuring preference (e.g., priorities)
to tasks with tighter time constraints, and explicit control over inversions of this preference,
particularly in the presence of the COTS “open” systems components frequently proposed for them.

Thus,  combining  Design  Theory  and  Scheduling  Theory,  for  this  class  of  systems,  is  mandatory, 
and should produce a Real-Time Architecture Theory which can then make extensive use of existing
methodologies for subsequent decomposition into objects, abstract data types, or other popular
constructions. The appropriate constructions for real-time systems can include only those that
are  analyzable  using  a  sound  Scheduling  Theory  if  unrecognized  real-time  performance  problems 
are  to  be  avoided. 

REAL-TIME  VALIDATION  AND  VERIFICATION 

Question  3:  What  is  the  best  method  for  validating  that  large,  distributed,  parallel  architecture 
real-time  systems  behave  as  specified? 

The combination of Design and Scheduling Theories is particularly critical to the ability to validate
the timing performance characteristics of a distributed real-time system for the same reason
that Design Theory alone is critical in validating the correct functional behavior of large, complex
systems. It is well known that software cannot be validated for correctness through testing alone;
this is especially a problem when systems become very large. The problems of time correctness
are, if anything, even more intractable, because a system with response time constraints will
frequently behave properly a large part of the time, failing only intermittently. In fact, timing failures
of real-time systems are almost always transitory (i.e., intermittent), with characteristics very
similar to those of intermittent hardware faults.

This fundamental intractability in the ability to test a system for meeting its response time
constraints renders it even more important to ensure that the software architecture conforms to a valid
theoretical  timing  model  that  will  make  the  analysis  of  the  system  tractable.  Only  in  this  way  can 
the  user  be  sure  that  not  only  does  the  system  generally  perform  correctly,  but  that  it  is  likely  to 
continue  to  perform  correctly,  even  in  the  presence  of  heavy  loads  (including  overloads). 

This does not mean, however, that there is no role for testing. The significance of testing relative
to time constraints is to validate that the implemented system faithfully meets the individual time
constraints of its components, leading to the assurance that the system as a whole will have the
timing characteristics inherent in its design model. Similarly, the testing must verify that the
system faithfully conforms to its architectural structure. Thus, for example, if the system architecture
specifies that Task A sends messages only to Task B and Task C, testing must verify that only those
inter-task messages are transmitted by Task A, and that the message sizes fall within the limits defined.
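
A conformance test of this kind can be sketched as a check of a recorded message trace against the architectural rules. The rule format, the 256-byte size limit, and the trace below are hypothetical and serve only to illustrate the Task A example; how the trace is captured on a real system is outside the scope of the sketch.

```python
# Hypothetical architectural rules: which tasks a sender may address and the
# largest message (in bytes) it may send.
ARCHITECTURE = {
    "A": {"allowed_receivers": {"B", "C"}, "max_size": 256},
}


def check_trace(trace, architecture):
    """Return the messages in `trace` that violate the architecture.

    `trace` is a list of (sender, receiver, size) tuples recorded during a
    test run. Senders without a rule are treated as unconstrained.
    """
    violations = []
    for sender, receiver, size in trace:
        rule = architecture.get(sender)
        if rule is None:
            continue
        if receiver not in rule["allowed_receivers"]:
            violations.append((sender, receiver, size, "receiver not allowed"))
        if size > rule["max_size"]:
            violations.append((sender, receiver, size, "message too large"))
    return violations


# Hypothetical trace recorded during integration testing.
trace = [("A", "B", 100), ("A", "D", 50), ("A", "C", 300)]
violations = check_trace(trace, ARCHITECTURE)
for violation in violations:
    print(violation)
```

An empty result does not prove conformance; it only shows that no violation occurred in the runs that were recorded.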

DEVELOPING  LARGE,  DISTRIBUTED  REAL-TIME  SYSTEMS  IN  THE 
FUTURE 

Question  4:  Given  that  resources  were  available  to  enhance  the  design  and  testing  methodologies 
for this class of systems, what are the most promising areas where these resources could be
applied? 


Clearly, developing large, distributed, parallel real-time systems successfully is an extremely
complex process. The research domains most likely to yield important results would be those
attacking the software architectural issues described previously in this paper. Important research
has been ongoing for some years in the individual areas of real-time synchronization,
communications, processor scheduling, memory management, and cache management, but additional work in
combining results from these areas into coherent architectural strategies could help greatly. The
disciplines  required  must  combine  scheduling,  software  engineering,  and  fault  management  at  a 
minimum.  The  skills  and  experience  base  needed  for  such  a  research  program  would  likely  be 
available  only  through  a  team  composed  of  both  academic  and  development  personnel. 

This position paper addresses the four issues listed in the announcement for
the  Workshop  on  Large,  Distributed,  Parallel  Architecture,  Real-Time  Systems 
to  be  held  at  IDA  March  15-19,  1993. 

The  positions  stated  on  the  various  issues  include  experiences  gained 
and  directions  taken  within  Hughes  regarding  the  system  development  process, 
design  methodologies,  and  the  use  of  CASE  tools.  Hughes  is  a  large  and 
diverse  company  and  the  stated  positions  do  not  necessarily  reflect  an  overall, 
uniform  company  policy  with  respect  to  methodology  and  tools. 

Most  of  the  language  dependent  design  and  implementation  issues  are 
directed  to  Ada  real-time  systems,  since  Ada  is  the  prevalent  programming 
language  used  on  a  significant  number  of  large,  distributed,  real-time  systems 
being  built.  Ada  also  has  a  tasking  model  included  at  the  application 
programming  level. 

In  discussions  addressing  general  concurrent  elements,  e.g.,  Unix  or 
VMS  processes,  the  term  process  is  used.  Concurrent  elements  in  Ada  are 
referred  to  as  tasks. 

1. What is the best method or methodology for designing large,
distributed real-time systems where processing elements may have
a parallel architecture?

It  is  not  sufficient  to  only  consider  a  design  methodology  for  the  development  of 
large,  distributed  real-time  systems.  The  scope  should  be  expanded  to  include 
an  integrated  system  development  process.  The  process  will  employ  several 
methodologies in the design and implementation of distributed systems. From
an  organizational  point  of  view,  systems  engineers,  hardware  engineers,  and 
software  engineers  should  be  involved  in  the  process,  i.e.,  concurrent 
engineering. 

Embedded  within  this  engineering  process  must  be  a  set  of  consistent 
graphical  representations  that  can  be  used  to  clearly  identify  products  of 
individual  steps  in  the  process,  and  that  will  clearly  show  the  transitioning 
activities  from  one  step  to  the  next.  A  system  development  process  that 
embraces  concurrent  engineering  should  include  the  following  steps: 

1.  Domain  analysis. 

2.  System  requirements  analysis 

3.  System  design 

a.  Partitioning  into 

•  subsystems 

•  hardware  and  software  components 

•  reusable  software  components 

b.  Configuring  (allocating  requirements  to  hardware  and  software) 

4.  Hardware  design 

5.  Software  requirements  analysis  (for  each  partition) 

6.  Software  design 

a.  Process  structuring  (concurrency  model) 

b.  Language  dependent  design,  e.g.,  for  Ada 

•  Ada  task  structuring 

•  class/object  structuring  (Ada  packaging) 

c.  Software  design  evaluation 

Two primary development approaches are currently in use for large,
distributed systems. The hardware-first approach utilizes all of the steps listed
above in the order suggested. There is a specific partitioning step to divide a
large  system  into  a  set  of  suitable  partitions,  followed  by  (and  iterated  with)  a 
configuring  step  which  allocates  system  requirements  to  hardware  and  software 
entities  (modules).  The  requirements  represented  within  each  module  are 
designed  and  implemented  in  the  specific  programming  language  as  virtual 
nodes  (VNs)  that  are  mapped  to  the  hardware  elements.  In  Ada,  for  example, 
VNs  are  collections  of  Ada  packages,  subprogram  bodies,  and  task  bodies  that 
implement  the  set  of  requirements  allocated  to  a  hardware  element  during  the 
partitioning/configuring  activity.  This  is  the  traditional  development  approach  for 
decomposing  a  large  system  into  partitions  that  can  be  implemented  as 
distributed  processing  elements  (PEs). 

The  software-first  approach  skips  the  partitioning/configuring  step  of 
allocating  requirements  to  hardware  and  software.  The  software  requirements 
analysis  is  performed  on  a  system  wide  basis  (rather  than  for  each  partition), 
and  software  elements  are  developed  before  the  hardware  architecture  has 
been  determined.  A  hardware  architecture  is  developed  to  support  the  software 
solution,  and  VNs  are  mapped  to  the  hardware  elements.  This  approach  can 
be  used  successfully,  for  example,  for  the  development  of  a  product  line,  where 
small,  medium,  and  large  systems  will  be  constructed  within  the  same  problem 
domain.  Different  hardware  architectures  are  developed  for  the  different  size 
systems,  and  the  software  representing  each  system  is  composed  of  reusable 
software  components. 

A  system  development  process  and  associated  methodologies 
supporting  the  six  steps  outlined  above  include: 

1. ART [HAC92] - This is an integrated system development process that
addresses  all  of  the  six  steps  outlined  above.  Domain  analysis  is  included 
as  the  first  activity  to  support  reusability  in-the-large  [NIE92]. 

2. Real-Time Structured Analysis (RTSA) - This is used as a methodology to
support the system requirements analysis, system design, and software
requirements analysis. The methodology and notation are taken from
Hatley-Pirbhai [HAT88], with supplemental information from Ward-Mellor [WAR85a,
WAR85b, and MEL86], and Shumate-Keller [SHU92].

3. Information Modeling - An information model is added to supplement the
Hatley-Pirbhai  process  and  control  models.  The  notation  for  the  Entity 
Relationship  Diagrams  (ERDs)  is  based  on  the  description  in  [CHE76]. 

4. OOD with Ada - This is an object-based Ada design methodology for large
real-time systems [NIE88, NIE92, and SHU88a]. The transitioning from
analysis to design is based on DARTS [GOM84]. Specific guidelines for
creating a process abstraction and Ada task structuring are included
[SHU88b]. The creation of VNs is based on certain structuring guidelines for
distributed Ada programs [ATK88, JHA89, VOL89, NIE90].

The  primary  CASE  tools  used  to  support  ART  include  Software  Through 
Pictures (StP), developed by IDE, and Teamwork, developed by Cadre.
Neither  of  these  two  tools  is  fully  integrated  to  support  the  complete 
development  process.  Both  vendors  are  currently  working  to  improve  the  tool 
support  for  additional  steps  of  the  development  process. 

It  should  be  noted  that  none  of  the  development  steps  or  tools  mentioned 
support  hardware  design. 


2.  What  should  be  the  relationship  between  real-time  Design 
Theory  and  real-time  Scheduling  Theory  in  a  design  methodology 
for  this  class  of  systems? 

Real-time  design  theory  refers  to  a  collection  of  features  that  pertain  to  a 
concurrency  model  of  multiple  processes  executing  in  parallel  on  distributed 
PEs.  Processes  may  also  compete  for  the  use  of  the  same  PE,  i.e.,  apparent 
concurrency,  as  opposed  to  processes  executing  in  different  PEs  with  real 
concurrency.  We  need  to  be  able  to  handle  both  conditions.  Some  of  the 
features of real-time design theory [BEN82, LEV90] that apply to scheduling
include: 

•  Safety  -  a  concurrent  element  (process)  must  perform  correctly  independent 
of  the  other  processes  in  the  system. 

• Liveness - two or more processes must exhibit the correct behavior in the
dynamic  environment  of  asynchronous  execution. 

• Adequate response time for critical events - e.g., interrupts must be serviced
in  a  timely  fashion. 

•  Schedulability  -  processes  must  be  scheduled  to  execute  and  complete 
their  functions  under  certain  system  dependent  time  constraints. 

•  Overload  response  -  if  all  the  processes  in  the  system  cannot  meet  their 
deadlines,  the  selected  critical  processes  must  still  be  serviced  to  meet  their 
deadlines. 

•  Mutual  exclusion  -  certain  sequences  of  instructions  will  be  expected  to 
execute  within  a  critical  section.  This  does  not  apply  directly  to  scheduling 
events,  but  applies  to  distributed  databases  and  multiple  access  of  shared 
data,  and  is  important  during  synchronization  of  processes. 

• Priority inheritance - the precise timing of a priority assigned to a process
that  will  execute  next  during  synchronization  of  two  processes  with  different 
priorities  (e.g.,  in  Ada  the  priority  of  the  highest  task  is  assigned  at  the  start  of 
the  rendezvous). 

Real-time  scheduling  theory  forms  the  basis  for  implementing  the 
scheduling  features  of  the  real-time  design  theory.  It  also  allows  the 
implementors  of  real-time  systems  to  analyze  timing  correctness  and  make 
predictions  about  the  expected  system  performance.  Elements  of  real-time 
scheduling theory [SHA90 and LEV90] include:

•  Scheduling  mechanism  -  can  be  based  on  round-robin,  time-slicing 
(deterministic) or a "fair" selection method with preemption
(non-deterministic).

•  Scheduling  algorithms  -  used  to  predict  whether  or  not  time-critical 
processes  will  complete  execution  within  required  timing  restrictions.  The 
most  prevalent  of  these  is  hard  deadline  scheduling  based  on  a  set  of 
periodic  processes:  Rate-monotonic  algorithm  where  the  processes  are 
assigned  priorities  in  reverse  relation  to  their  periodicities  (shortest  period 
given  highest  priority). 

•  Fairness  in  scheduling  —  can  be  implemented  by  dynamically  increasing  the 
priority  of  a  process  as  the  criticality  increases  (e.g.,  making  a  controlled 
object  avoid  hitting  an  obstacle). 
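
The rate-monotonic assignment described above can be illustrated with a short sketch. The task names, periods and execution times below are hypothetical, and the utilization bound used is the classic Liu and Layland sufficient test for independent periodic tasks on a single preemptive PE, which is an assumption brought in for illustration and is not stated in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    name: str
    period: float  # time between successive releases
    wcet: float    # assumed worst-case execution time per release


def rate_monotonic_order(tasks):
    """Return tasks from highest to lowest priority (shortest period first)."""
    return sorted(tasks, key=lambda t: t.period)


def utilization(tasks):
    """Total processor utilization of the task set."""
    return sum(t.wcet / t.period for t in tasks)


def liu_layland_bound(n):
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)


def passes_utilization_test(tasks):
    """Sufficient, but not necessary, schedulability test."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical task set, times in milliseconds.
tasks = [
    PeriodicTask("telemetry", period=100.0, wcet=20.0),
    PeriodicTask("control", period=20.0, wcet=5.0),
    PeriodicTask("display", period=200.0, wcet=40.0),
]

for priority, task in enumerate(rate_monotonic_order(tasks), start=1):
    print(priority, task.name, task.period)
print("schedulable by bound:", passes_utilization_test(tasks))
```

With this assumed task set the total utilization is 0.65, which is below the three-task bound of roughly 0.78, so the set passes the test. A set that fails the bound is not necessarily unschedulable; it simply needs a more exact analysis.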

The current design methodologies should be expanded to include
guidelines for dealing with real-time considerations (e.g., safety, liveness,
schedulability, etc.). Specific heuristics should be developed for handling the
scheduling of concurrent processing elements. This should include aperiodic
as well as periodic tasks. These expanded design guidelines should be based
on known scheduling theories that can reasonably be expected to be available
in  run-time  models,  e.g.,  the  current  preemptive  Ada  tasking  model  and 
proposed  rate-monotonic  mechanisms. 

The  design  theory  we  use  in  our  design  methodology  must  be 
implemented  in  the  semantics  of  the  process  abstraction  model.  For  example,  it 
does  not  make  sense  to  try  to  implement  critical,  periodic  tasks  in  the  current 
version of Ada-1983 which has a run-time implementation of an "at least"
condition  for  the  expiration  of  a  periodic  task.  It  is  important  that  design  theory 
methodologists  and  run-time  implementers  communicate  effectively  about  their 
respective  needs  and  possible  run-time  implementation  problems.  As  future 
improvements  of  run-time  systems  are  developed  for  real-time  systems,  we 
must  avoid  the  frustrating  situations  of  the  early  Ada  systems  when  the 
designers  tried  to  implement  designs  based  on  preemptive  scheduling  that  did 
not exist in the implementation model. The run-time developers had interpreted
the  Ada  Reference  Manual  to  mean  that  preemption  was  not  required. 
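
The effect of the "at least" delay semantics on a periodic task can be shown with a small simulation. This is a sketch only: the fixed overshoot per delay is an illustrative assumption (real overshoot varies with system load), and the function names are hypothetical. It compares a loop that delays for a fixed interval each cycle with one that computes each delay from an absolute release schedule.

```python
def relative_delay_releases(period, overshoot, count):
    """Release times when every cycle simply delays for `period`.

    Each delay lasts at least `period`; the assumed overshoot is added
    every cycle, so the lateness accumulates.
    """
    now, releases = 0.0, []
    for _ in range(count):
        releases.append(now)
        now += period + overshoot
    return releases


def absolute_delay_releases(period, overshoot, count):
    """Release times when each delay is computed from an absolute schedule.

    The task delays for (next_release - now), so an overshoot in one
    cycle is not carried into the following cycles.
    """
    now, next_release, releases = 0.0, 0.0, []
    for _ in range(count):
        releases.append(now)
        next_release += period
        now += max(0.0, next_release - now) + overshoot
    return releases


period, overshoot = 10.0, 0.5  # hypothetical values, arbitrary time units
for k, (rel, ab) in enumerate(zip(relative_delay_releases(period, overshoot, 5),
                                  absolute_delay_releases(period, overshoot, 5))):
    ideal = k * period
    print(f"cycle {k}: ideal {ideal:5.1f}  relative {rel:5.1f}  absolute {ab:5.1f}")
```

Under these assumptions the relative-delay loop is 2.0 time units late by the fifth release, while the absolute-schedule loop stays within a single overshoot of the ideal release time. Neither form turns "at least" into "exactly", which is why the run-time semantics matter to the designer.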


3.  What  is  the  best  method  for  validating  that  large,  distributed, 
parallel  architecture  real-time  systems  behave  as  specified? 

The  validation  of  large,  distributed  real-time  systems  includes  three  primary 
elements:  (1)  a  set  of  plans  and  procedures  for  how  the  validation  is  to  be 
performed; (2) a training plan to ensure that the developers are implementing
the  test  plans  and  procedures  in  a  consistent  manner;  and  (3)  a  set  of  modeling 
and  test  tools  to  support  the  validation  process. 

DoD-Std-2167A  has  received  considerable  (justified)  criticism  with 
regard  to  a  literal  interpretation  of  the  contents  and  order  of  its  numerous 
analysis  and  design  documents,  and  the  implication  that  it  tends  to  dictate  a 
design  methodology.  Such  a  literal  interpretation  should,  however,  be 
encouraged  for  the  2167A  set  of  test  documents  which  include  a  Software  Test 
Plan, Software Test Description, and Software Test Report. An early focus on
the contents of these documents (even for commercial projects where 2167A is
not  required)  forces  attention  to  the  test  phase  as  a  process.  A  description  of 
test  cases  is  prepared  before  the  actual  test  phase  begins,  and  helps  to  identify 
the  efforts  required  for  unit  testing  and  integration  testing.  Particular  attention 
should  be  paid  to  test  cases  and  validation  procedures  to  analyze  the  system 
for  deadlock,  starvation,  data  integrity,  schedulability  of  processes,  and 
communication  performance. 

A  significant  amount  of  training  may  be  required  before  the  test  phase 
begins  to  ensure  that  the  developers  understand  the  kind  of  testing  required, 
and  that  they  will  be  following  the  test  procedures.  This  is  even  more  important 
for  distributed  real-time  systems  where  considerable  challenges  are  presented 
for  validating  real-time  performance  requirements.  In  many  cases  we  are 
finding  that  the  developers  are  merely  trying  to  get  a  system  to  run  during  the 
test  phase,  when  they  should,  instead,  be  testing  a  running  system. 

There  is,  unfortunately,  no  unique  list  of  support  tools  that  will  guarantee 
the  complete  validation  of  a  distributed  real-time  system.  Useful  test  tools 
include  code  analyzers,  static  analyzers,  test  probes,  and  modeling  tools. 
Formal  methods  embedded  within  these  tools  must  be  clearly  understood,  in 
particular,  with  regard  to  their  limitations.  Results  in  the  form  of  metrics  must  be 
used sensibly, and do not represent "proof of correctness." The use of
automated  test  tools  should  be  encouraged  during  the  entire  development 
period  to 

The  overall  test  philosophy  should  be  based  on  finding  bugs  as  early  as 
possible in the development cycle. The validation process should occur
throughout  the  development  cycle  to  give  us  a  better  product  delivered  to  the 
customer.  To  support  this  philosophy,  we  need  a  consistent  error  reporting 
mechanism  throughout  the  development  cycle.  A  concentration  of  design  and 
coding  errors  in  certain  functional  areas  will  focus  our  testing  efforts  to  those 
areas  (but  not  to  the  detriment  of  other  areas). 


4. Given that resources were available to enhance the design and
testing  methodologies  for  this  class  of  systems,  what  are  the  most 
promising  areas  where  these  resources  could  be  applied? 

4.1  Design  Methodologies  and  Tools 

The  primary  key  to  reusable  designs  in  distributed  systems  is  the  degree  of 
transparency  of  the  inter-process  and  inter-processor  communication  (IPC) 
mechanism.  Most  of  the  distributed  systems  implemented  today  are  designed 
with a unique IPC mechanism of the "not-invented-here" variety. This is also
true  to  some  extent  for  the  underlying  communication  protocols  regarding  the 
number  of  layers  that  are  implemented. 

Specific  design  guidelines  should  be  established  for  the  creation  of 
standardized  IPC  mechanisms  based  on  the  required  functionality,  e.g., 
broadcast,  synchronous  and  asynchronous  connection-oriented 
communication,  multicast,  etc.  The  guidelines  should  include  the  use  of 
message  passing,  remote  procedure  calls  (RPC),  remote  entry  calls  (REC),  and 
the  use  of  shared  data  in  heterogeneous  and  homogeneous  architectures. 

A  set  of  standardized  interfaces  (bindings)  should  be  developed  for  Ada, 
C,  and  C++  programs  for  each  of  the  IPC  mechanisms  developed.  This  will 
promote  truly  reusable  programs  at  the  application  interface  level. 

Design  guidelines  should  be  developed  for  implementing  Ada,  C,  C++, 
and  mixed  language  programs  in  distributed  architectures.  This  should  include 
alternatives  to  the  use  of  the  Ada  tasking  model. 

Design  guidelines  should  be  developed  for  distributed  database  design 
including  a  set  of  standardized  locking  mechanisms. 

Funds  should  be  made  available  to  support  the  major  tool  vendors  for 
improving  existing  CASE  tools  for  the  most  promising  system  design 
methodologies. 


4.2  Testing  Methodologies  and  Tools 

An  important  element  of  validating  large,  distributed  real-time  systems  is  the  use 
of prototyping. The traditional method of prototyping includes the use of
throw-away code as the complete system is implemented. A better approach is to
develop a set of prototyping tools that can aid in the debugging and
understanding  of  the  system  to  be  implemented,  without  developing  throw-away 
code.  An  example  of  such  a  tool  is  a  device  to  simulate  a  particular  bus  or  LAN 
interface,  e.g.,  Mil-Std-1553  or  Ethernet.  This  device  will  simulate  the  bus  or 
LAN  functions  (e.g.,  the  arbitration  mechanism)  and  record  the 
stimulus/response  activity  with  the  distributed  system.  Such  a  device  can  be  an 
invaluable  aid  in  understanding  the  distributed  system  in  terms  of  transient 
functions  like  startup,  restart,  and  error  detection  and  recovery. 

A  set  of  test  cases  has  been  developed  for  measuring  the  performance  of 
real-time features in Ada programs in uniprocessor architectures (available from
SIGAda's PIWG). A similar set of test cases could be developed for measuring
the performance of programs in multiprocessor architectures. Particular
performance features could include communication time latency, the
efficiency of the IPC mechanism, and schedulability. Special code analyzers
could be developed to predict the potential for deadlock and starvation. The
reduction  of  these  real-time  risk  areas  before  testing  starts  would  greatly 
enhance  the  validation  process. 
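
One form such a code analyzer could take is a check of lock acquisition order. The following is a minimal sketch under stated assumptions: each process is described by the order in which it acquires nested locks, the process and resource names are hypothetical, and a cycle in the resulting lock-order graph is reported as a potential (not certain) deadlock. It does not address starvation.

```python
def lock_order_graph(acquisition_orders):
    """Build a graph with an edge A -> B when some process holds A while acquiring B."""
    graph = {}
    for order in acquisition_orders.values():
        for i, held in enumerate(order):
            graph.setdefault(held, set())
            for wanted in order[i + 1:]:
                graph[held].add(wanted)
    return graph


def has_cycle(graph):
    """Depth-first search for a cycle in the lock-order graph."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # back edge: two locks are taken in conflicting orders
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# Hypothetical processes and shared resources.
risky = {"tracker": ["track_db", "bus"], "display": ["bus", "track_db"]}
safe = {"tracker": ["track_db", "bus"], "display": ["track_db", "bus"]}

print("risky design may deadlock:", has_cycle(lock_order_graph(risky)))
print("safe design may deadlock:", has_cycle(lock_order_graph(safe)))
```

The check is conservative: a reported cycle means the two processes can block each other under some interleaving, which is the kind of risk worth removing before testing starts.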

Dynamic  analyzers  can  be  developed  to  measure  the  performance  of  a 
distributed  system  in  a  truly  asynchronous  environment.  The  currently  available 
static  analyzers  don't  help  us  here. 


Large-Scale Distributed Real-Time Computing

1.0 Introduction

Real-time computing and communication systems are critical to an industrialized nation’s technological
infrastructure. Modern telecommunication systems, automated factories, defense systems
and air-traffic control systems cannot operate without them. Indeed, real-time computing and
communication systems control the very systems that keep us productive, make our manufacturing
processes competitive, enhance our security, and enable us to explore new frontiers of science
and engineering. The explosive growth of applications executed dependably in real time is envisioned
in diverse areas such as battlefield simulations, C3I systems, high-bandwidth multimedia
communications, and distributed flexible manufacturing.

The key requirements for advanced real-time systems are predictability, dependability and performance.
A real-time system’s timing behavior should be predictable before it is developed or modified.
The system must have the ability to tolerate the failure of individual subsystems and provide
a high degree of performance. The most significant developments in these areas are the generalized
rate-monotonic scheduling theory that provides a theoretical foundation for the development
of predictable real-time systems, the membership-based fault-tolerance protocols that allow flexible
management of redundant resources, gigabit networking technology, high-performance RISC
processors and parallel processing architectures.

The DoD 1991 Software Technology Strategy document refers to rate-monotonic scheduling (RMS) as a “major payoff” and
states that “System designers can use this theory to predict whether task deadlines will be met
long before the costly implementation phase of a project begins. It also eases the process of making
modifications to application software,...” The Acting Deputy Administrator of NASA
stated in a 1992 speech entitled Charting The Future, “Through the development of Rate Monotonic
Scheduling, we now have a system that will allow (Space Station) Freedom’s computers to
budget their time, to choose between a variety of tasks, and decide not only which one to do first
but how much time to spend in the process.” The RMS approach is also cited in the Selected
Accomplishments section of the National Research Council’s 1992 report, A Broader Agenda for
Computer Science and Engineering. Our generalized RMS (GRMS) approach has also been rapidly gaining acceptance
in the industry, and has been applied to national high-technology projects such as BSY-1,
BSY-2 and NASA’s Space Station. Scheduling support for the use of generalized RMS can now
be found in major national hardware and software standards such as Ada 9X, IEEE POSIX.4, and
the IEEE Futurebus+ bus standard.

To  advance  the  state  of  real-time  computing,  it  is  important  to  build  upon  these  successes.  Thus, 
we  propose  to  extend  GRMS  [2,3,4,7]  in  the  context  of  a  very  large-scale  distributed  computing 
system  where  the  communication  delays  make  it  impossible  for  each  scheduler  to  have  timely  and 
complete system state information. Furthermore, we must create a unified framework for high-performance
real-time fault-tolerant computing. This unified framework should provide an application
infrastructure that employs high-performance computers and networks. Such systems must
handle high-volume synchronized video, audio and text as well as real-time data streams with different
periodicities and latency requirements from radar, hydrophones, satellite and other measurement
instruments. Users of these systems can virtually visit different geographical locations,
and get a “first-hand view” of the situation. Global assessment can be facilitated by gathering
information from different points at a single decision point. In large-scale complex systems, component
failures and application software errors are inevitable. The ability of the unified framework
to deal with software and hardware errors can greatly enhance the reliability, functionality
and flexibility of C3I systems, air traffic control systems, modern mass-transportation systems,
automated factories and defense applications such as nationwide battlefield simulations.

2.0  Fundamental  Challenges 

To develop an application infrastructure for large-scale distributed real-time computing, there are
some  fundamental  challenges  that  we  must  meet. 

•  Decisions  in  this  distributed  environment  must  be  made  in  decentralized  fashion  from  both 
dependability  and  performance  points  of  view.  However,  due  to  the  geographical  distribution 
of  subsystems,  propagation  delays  can  be  excessive,  and  decisions  must  be  based  on  delayed 
and  sometimes  even  incomplete  information.  Nevertheless,  the  distributed  scheduling  actions 
must be consistent.

• Dependability requires that individual subsystem faults do not crash the entire system. In particular,
two critical yet complementary aspects must be addressed. First, an approach to deal
with the increasingly serious problem of application software errors is necessary. Second, we
need  the  analytical  foundations  and  system  primitives  to  deal  with  failures  of  hardware  and 
software  resources. 


2.1  Maintaining  Coherence  in  Distributed  Scheduling  Actions 
As  network  speed  and  the  physical  distances  between  nodes  increase,  the  state  of  the  system  is 
distributed. This has been recognized as a key problem of the Core CS&E Research Agenda for the
Future of the 1992 National Research Council’s report titled A Broader Agenda for Computer
Science and Engineering. It states, “A network is an interconnected system, with many possible
paths for feedback to any given node ... the inability to predict just when these feedback effects
will occur presents many problems for system designers concerned about avoiding catastrophic
positive feedback loops that can rapidly consume all available bandwidth.”

Fortunately, we have already solved this problem for the special case of a wide area dual-link network
[5]. A dual-link network consists of two unidirectional links carrying traffic in fixed size
cells in opposite directions. This can be considered as a special case of a network of switches with
only two connections. The bandwidth usage requirements of downstream stations are fed back by
setting a request bit in a cell flowing in the opposite direction. However, such feedback is delayed
due to the large distance. Furthermore, the bandwidth requirements of upstream stations will never be
known by the downstream stations. Thus, stations in a large high speed network must make
scheduling decisions with incomplete and delayed information. The challenge is to achieve predictable
operation under these circumstances.

We  have  developed  a  theory  of  coherent  dual-link  networks  and  a  coherent  scheduling  protocol 
that ensures that the system will be consistent despite its distributed state [5]. Under this protocol,
traffic in a dual-link network is transmission-schedulable when an equivalent centralized system
is  schedulable.  It  is  important  to  generalize  this  notion  to  an  arbitrary  network  topology. 

2.2 Software Fault-Tolerance and Analytic Redundancy

Statistics of large computing systems show that the probability of a system failure due to software
bugs is about ten times that due to hardware faults. Therefore, we must be able to deal
with software errors to have a reliable real-time system. To deal with software faults, some form
of redundancy in computation is needed. Direct redundancy uses different programs to compute
the same results so that voting or mid-value selection can be used. An example of direct redundancy
for software fault tolerance is N-version programming. The fundamental problem with
direct redundancy is that it is costly and independently developed software can still have common
errors. The source of software faults is complexity. To successfully deal with software faults, we
must let simplicity control complexity.
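
Voting and mid-value selection over the outputs of redundant program versions can be sketched as follows. The function names and sample outputs are hypothetical; the sketch assumes an odd number of versions for mid-value selection and exact equality of outputs for voting.

```python
import statistics
from collections import Counter


def mid_value_select(outputs):
    """Mid-value selection: take the median of the redundant outputs."""
    return statistics.median(outputs)


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


# Three hypothetical versions; one produces a wild value.
print(mid_value_select([101.2, 100.9, 250.0]))  # the outlier is ignored

# A common error shared by two versions outvotes the correct one.
print(majority_vote([3, 3, 7]))

# No majority at all: the voter cannot decide.
print(majority_vote([1, 2, 3]))
```

The second call shows the weakness noted above: when independently developed versions share an error, the voter selects the wrong value with full agreement.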

This approach is realized by the use of analytic redundancy [6]. Under this approach, programs
with different complexities will compute different results that are analytically related. In particular,
we will develop a trusted simple software system which will give us a baseline answer on
time plus a set of confidence assertions. A confidence assertion is a generalization of the statistical
concept of confidence interval, which creates an “envelope” within which the solutions from the
complex software must lie. The complex software is not trusted. Its outputs must be consistent with the
confidence assertions produced by the simple software or they will be discarded. Furthermore, we
are not even able to trust the computation process employed by the complex software, which may
have serious bugs that can crash the computation process itself.
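
The acceptance rule described above can be sketched in a few lines. Everything named here is hypothetical: the simple software is assumed to return a baseline value and a tolerance that together form the envelope, the complex software is assumed to report its own elapsed time, and a try/except block merely stands in for real runtime isolation, which would require separate protected execution.

```python
def simplex_select(measurement, simple_fn, complex_fn, time_budget):
    """Use the complex output only if it is on time and inside the envelope."""
    baseline, tolerance = simple_fn(measurement)  # trusted, always available
    try:
        candidate, elapsed = complex_fn(measurement)  # untrusted
    except Exception:
        return baseline, "fallback: complex software crashed"
    if elapsed > time_budget:
        return baseline, "fallback: timing fault"
    if abs(candidate - baseline) > tolerance:
        return baseline, "fallback: outside confidence envelope"
    return candidate, "complex output accepted"


def simple_tracker(m):
    """Trusted estimate: the raw measurement with an assumed envelope of +/- 5."""
    return m, 5.0


def good_tracker(m):
    return m + 1.5, 2.0   # (estimate, elapsed time)


def wild_tracker(m):
    return m + 40.0, 2.0  # estimate far outside the envelope


def slow_tracker(m):
    return m + 1.0, 50.0  # misses the time budget


def crashing_tracker(m):
    raise RuntimeError("illegal address")


for tracker in (good_tracker, wild_tracker, slow_tracker, crashing_tracker):
    print(tracker.__name__, simplex_select(100.0, simple_tracker, tracker, time_budget=10.0))
```

In every failing case the system still delivers the baseline answer from the simple software, which is the sense in which simplicity controls complexity.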

To illustrate the use of analytic redundancy, we found that it was useful to classify software errors
into three types for the purpose of detection and recovery. We shall use tracking as an example to
illustrate this concept.

• Inaccuracy: In the context of our applications, these are tracking errors, which are a function of
the  quality  of  the  data  and  the  sophistication  of  the  tracking  algorithm.  Due  to  the  nature  of  the 
application,  such  errors  can  only  be  reduced  but  not  eliminated.  Design  or  implementation 
errors  in  software  development  can  also  contribute  to  tracking  errors. 

•  Timing  faults:  These  typically  occur  in  the  form  of  missed  deadlines.  While  software  design 
and  implementation  errors  may  lead  to  timing  faults,  a  major  source  of  timing  faults  is  the 
time-complexity  of  the  algorithms.  The  difficulty  in  tracking  applications  is  that  sophisticated 
algorithms  may  reduce  the  number  of  tracking  errors  but  contribute  to  timing  faults. 

• Programming system faults: These are serious software faults that may crash the system,
for  example,  illegal  addresses  or  data,  exhausting  available  buffers,  monopolizing  the  I/O 
channels  and/or  CPU. 

Figure  1  is  a  model  which  compares  the  characteristics  of  a  simple  software  system  with  those  of 
a  complex  software  system. 

The software architecture used to deal with these faults is known as the Simplex Software Architecture.
This architecture employs: (1) analytic redundancy to guard against application level software
errors, (2) runtime isolation and fault containment techniques to guard against programming
system level software errors such as illegal addressing, and (3) generalized rate-monotonic
scheduling techniques to guard against timing errors. Figure 1 is the conceptual model that illustrates
the combined use of scheduling, runtime fault containment and analytic redundancy to
improve the overall functional performance and reliability.

FIGURE 1. Conceptual Model to Build Dependable Real-Time Systems

2.3 System-Level Failure Management in Distributed Real-Time Systems
Distributed gigabit network based real-time systems must be robust and fault-tolerant. However,
the construction of distributed fault-tolerant real-time systems brings new challenges. Processor
and network scheduling must be carried out coherently with the support mechanisms for fault-tolerance.
In addition, these integrated mechanisms must be supported by all system layers. Generalized
RMS already provides a solid foundation in processor and bus scheduling for real-time
systems. A critical system issue is the need for application-independent support at the system
level to build dependable real-time systems. Such support will greatly enhance the ability to tolerate
a wide range of system faults (along the system fault dimension of Figure 1) including the failures
of processors, communication links and interfaces, process creation, and communication
protocols.

Traditionally,  there  has  been  a  misconception  that  priority-based  scheduling  techniques  cannot 
ensure  determinism  when  redundancy  techniques  are  used.  As  a  result,  most  if  not  all  real-time 
fault-tolerant  systems  use  cyclical  executives  that  employ  lock-step  execution  and  comparison  of 
redundant components. However, the only necessary correctness criterion is the need to maintain
I/O determinism in redundancy management and the interface to the external environment.


GRMS  can  be  used  as  the  basis  to  provide  I/O  determinism  while  still  allowing  different  programs 
to  execute  on  redundant  processors.  However,  the  protocols  to  detect  and  recover  from  faults  on  a 
timely basis must be developed. The critical factor for developing these protocols is that a timely
and  consistent  view  of  the  state  of  the  distributed  system  resources  is  maintained  despite  failures. 

We are currently developing an analytical approach and system primitives to provide system-level
support for tolerating and/or recovering from processor, process and communication failures in
distributed real-time systems. The key element behind system-level fault-tolerance is the real-time
management of spatial redundancy to achieve dependable system operation even in the presence
of resource failures.


3.0  Summary  and  Conclusion 

Real-time computing and communication systems are critical to an industrialized nation’s technological
infrastructure. Modern telecommunication systems, automated factories, defense systems
and air-traffic control systems cannot operate without them. The key requirements for advanced
large-scale real-time systems are predictability, dependability and performance. A real-time system’s
timing behavior should be predictable before it is developed or modified. It should have the
ability to tolerate the failure of individual subsystems while providing a high degree of performance.
Significant developments in these areas are the generalized rate-monotonic scheduling
theory which provides a theoretical foundation for developing predictable real-time systems,
membership-based fault-tolerance protocols for flexible management of redundant resources,
wide-area gigabit networks, and high-performance RISC and parallel processing architectures. It
is important to build upon the successes of GRMS in industry, high-technology projects and commercial
standards.

We propose to extend GRMS in the context of a very large-scale distributed computing system
where the communication delays make it impossible for each scheduler to have timely and complete
system state information. Furthermore, we must create a unified framework for high performance
real-time fault tolerant computing. This unified framework should provide an application
infrastructure that allows us to develop advanced large-scale real-time systems.


As the complexity of new applications of large distributed real-time systems increases, so does the
need for improvement in real-time system design and development methodology. The critical nature of
many real-time systems requires rigorous design and development of their components, and validation
of timing characteristics. The traditional approach that carries out the tasks of system modeling, timing
verification, and system implementation rather independently seems inadequate for developing a large
distributed real-time system partly because

(1) verification of timing-related properties has limitations, especially in distributed/parallel environments,

(2) timing characteristics are very hard to determine in the early stages of design,

(3)  it  may  introduce  inconsistencies  between  the  model  and  the  implemented  system, 

(4)  due  to  high  cost  and  long  development  time,  it  is  often  too  late  when  problems  are  discovered  at 
system  integration  time. 

For  example,  it  is  very  hard  to  determine,  during  the  design  phase,  the  synchronization  and  com¬ 
munication  requirements  among  tasks  that  will  be  distributed  to  several  nodes  in  the  implemented  system. 
However, the scheduler at each node should rely on that information to provide predictable timing
behavior. Furthermore, to have a complete design, we need to decide which scheduling algorithms are to
be used for which resources at design time. We can make worst case assumptions in many cases, but
there should be a facility to test the impact of the design assumptions on the timing characteristics of the
system. Making worst case assumptions for designing a real-time system that can ensure correct operation
even in the situations with maximum need potentially wastes large amounts of resources. In some
applications of large distributed real-time systems, the worst case need may be unbounded.

It seems clear that we need a new integrated approach to design, development, and verification of
large distributed real-time systems. It should provide a facility to evaluate the timing constraints in the
early stage of the system design, and to monitor their impact during system implementation. Such an
environment dedicated to the design and development of real-time systems must support many facilities
that are not present in current programming environments. Recently, there have been attempts to provide
an integrated environment for real-time system design, development, and evaluation [Son92, Bar91,
Ran91, Jah91]. However, this is an area in which more support for active research and investigation
seems necessary due to its high potential for significant benefits to overall system design and development.

Given the functional and timing specifications for a real-time system, the first challenge is to validate
that there exists no inconsistency in the specification. This is a non-trivial task. Even though there
has been a considerable research effort in this area of verifying the specification (e.g., RTL and
Modechart from UT Austin [Stu90]), applicability of those formal methods to practical problems has
several limitations.

Assuming that the specification is validated to be consistent and feasible, we need to come up with
an initial system design. At this stage, we have only a very rough idea about the resource and
synchronization requirements of tasks. The integrated environment should provide a tool that can help
the designer to develop a top-level design from the given specification. The object-oriented approach
seems appropriate for this type of tool, because the external behavior of each component (or object) can
be specified without going through the internal implementation details. Even though there are several
tools and methods developed for real-time system design (e.g., [Fau92]), their capabilities have not yet
been tested for large and complex real-time systems.

The next step is to develop a prototype of the system according to the initial design. The prototype
consists of modules that represent system components in the initial design. Modules for which the
implementation has not been determined, or which stand for hardware components that are not yet
available, can be simulated. The simulated part models the resource/synchronization requirements of the
physical object that it represents. The timing constraints and functionalities of the given specification
can be tested using the prototype. If the initial design does not satisfy the given specification, the design
should be refined. In some cases, the initial design may need to be abandoned and totally redesigned.
This refinement process continues until we have a stabilized system design and prototype that satisfies
all the requirements. By following this iterative refinement and prototyping, we can evaluate the impact
of design choices early in the design stage and make necessary changes.

One of the benefits of this integrated design approach is that the designer can check whether the
design philosophy under which the system is being developed is appropriate for the current application.
For example, in the early design stage, we need to decide on the philosophy for resource management. In
most real-time systems, the responsibility for resource management is typically shared by the operating
system and the application, partly because it is the application that knows about the requirements and
semantic information necessary to support timeliness even in the presence of overloads and faults. There
is a spectrum of design approaches to dividing responsibilities between the two, and the decision depends
primarily on the design philosophy and methods used to build applications [Nat92]. At one extreme, the
operating system provides no special support, and the total responsibility is on the application. At the
other extreme, the operating system takes all the responsibility for scheduling with no information from
the application. These two extremes are convenient in the sense that the operating system and the
application do not need to share application-specific semantics. However, for the same reason, the
capabilities of those approaches are inherently limited. Using the integrated approach, we can test not
only the two extremes, but also other approaches rather easily.

Another example that demonstrates the benefits of this approach is the choice of scheduling
algorithms/policies. In contrast with non-real-time systems, in which a relatively simple scheduler
selects a ready job non-deterministically without considering timing requirements, schedulers in
real-time systems must use a variety of information and selection criteria. The many choices and
variations in scheduling policies make it almost impossible to know which choice would perform better
without actually testing them against the given requirements. If we know the execution time and
blocking time of each task, we may be able to perform schedulability analysis using certain scheduling
theories. However, those timing characteristics can be estimated only after we determine the scheduling
policies. This shows why the integrated design approach combined with prototyping is beneficial. We can
plug different scheduling policies into the prototype and test their timing behavior.
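
As a rough illustration of plugging policies into a prototype, the sketch below simulates periodic tasks on one processor in unit time steps and counts deadline misses under two interchangeable policies. Everything in it is an assumption made for illustration: the task set, the function names, and the choice of rate monotonic and earliest-deadline-first as the two policies being compared. It is not a tool described by the authors.

```python
from typing import Callable, List, Tuple

Task = Tuple[int, int]  # (execution time C, period P); deadline = end of period

# A policy maps a ready job to a sort key; the smallest key runs next.
def rate_monotonic(task: Task, abs_deadline: int) -> int:
    return task[1]          # shorter period means higher priority

def earliest_deadline_first(task: Task, abs_deadline: int) -> int:
    return abs_deadline     # nearer absolute deadline means higher priority

def count_deadline_misses(tasks: List[Task],
                          policy: Callable[[Task, int], int],
                          horizon: int) -> int:
    """Simulate unit time steps on one processor and count missed deadlines."""
    n = len(tasks)
    remaining = [0] * n     # unfinished work of the current job of each task
    deadline = [0] * n      # absolute deadline of the current job
    misses = 0
    for t in range(horizon):
        for i, (c, p) in enumerate(tasks):
            if t % p == 0:              # a new job of task i is released
                if remaining[i] > 0:    # the previous job did not finish
                    misses += 1
                remaining[i] = c
                deadline[i] = t + p
        ready = [i for i in range(n) if remaining[i] > 0]
        if ready:
            chosen = min(ready, key=lambda i: policy(tasks[i], deadline[i]))
            remaining[chosen] -= 1
    # Jobs still unfinished when their deadline coincides with the horizon.
    misses += sum(1 for i in range(n)
                  if remaining[i] > 0 and deadline[i] <= horizon)
    return misses

# Hypothetical task set with total utilization 2/5 + 4/7, just below 1.
tasks = [(2, 5), (4, 7)]
rm_misses = count_deadline_misses(tasks, rate_monotonic, 35)
edf_misses = count_deadline_misses(tasks, earliest_deadline_first, 35)
print("rate monotonic misses:", rm_misses)
print("earliest deadline first misses:", edf_misses)
```

The same prototype loop is reused unchanged; only the policy function is swapped, which is the kind of experiment the integrated approach is meant to make cheap.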

Another important requirement for the integrated approach is to provide modeling capability for not
only the target system but also the operating environment. To achieve that, the integrated design approach
should support running the prototype under the proposed operating environment. Some of the facilities
that are necessary include the ability to

(1)  generate  external  events, 

(2)  change  the  values  of  conditions, 

(3)  update  variables  and  other  data  elements, 

(4)  trigger  state  changes, 

(5)  activate/deactivate  task  activities. 

The testing/debugging phase usually constitutes a large proportion of the total system development
time. Due to the critical role played by large distributed real-time systems, it is almost always necessary
to enforce the highest level of quality assurance in testing. With the advent of more
ambitious applications of large distributed real-time systems, such as NASA's space station project,
testing and validating the quality of the developed software becomes more costly and time consuming. Any
small reduction of the complexity of the testing phase, while maintaining the same guarantee of system
performance, may result in a substantial benefit to the system development effort. The integrated design
and development approach can substantially reduce the amount of work involved in testing to ensure
timing constraints.

To summarize, the major advantages of the integrated approach to design and development of
real-time systems include:

(1) It allows timing properties of a real-time system being designed/developed to be analyzed in early
stages of the system development cycle.

(2) It allows the functional correctness to be tested, while reducing the effort needed to redesign the
system and/or components.

(3) It allows the assumptions made during the design phase to be verified.

(4) It encourages reusability of system components.
1 Introduction

We are concerned with deadline guarantees in distributed hard real-time systems. In particular, we
address issues in guaranteeing hard real-time message delivery in an FDDI (Fiber Distributed Data
Interface) network.

It has become a common practice to use digital computers for embedded real-time applications
such as space vehicle systems, image processing and transmission, and integration of expert systems
into avionics and industrial process control. A salient feature of these computations is that they have
stringent timing requirements. A timing failure could lead to catastrophe. Further, these systems
are often distributed. This is not only because the applications themselves are often physically
distributed, but also due to the potential that distributed systems have for providing good reliability,
good resource sharing, and good extensibility [19, 44, 52].

The key to success in using a distributed system for these applications is the timely execution of
computation tasks that usually reside on different nodes and communicate with one another to
accomplish a common goal. End-to-end deadline guarantees are not possible without a communication
network that supports the timely delivery of inter-task messages. On the other hand, despite efforts
to make the system reliable, faults may still occur due to a severe working environment and failing
components. The main focus of our work is to address some important issues related to fault-tolerant
guarantees of synchronous message deadlines, i.e., no matter what happens (even in the presence of
a network fault), the messages will be transmitted before their deadlines.

We have selected FDDI (Fiber Distributed Data Interface) networks for this study. FDDI is
an ANSI standard for a 100 Mbits/sec fiber optic token ring network [2, 3]. FDDI is suitable for
real-time applications not only because of its high bandwidth but also due to its bounded access time
and its dual ring architecture. Since the early 1980's, extensive research has been done on FDDI
networks. The FDDI MAC protocol was first proposed by Grow [12]. Ross [34, 35, 36], Iyer and Joshi
[14, 15] and others [25, 43] provided comprehensive discussions on the timed token protocol and its
use in FDDI. Many new civil and military networks are being developed based on the skeleton
of FDDI. Examples include the High-Speed Data Bus and the High-Speed Ring Bus (HSDB/HSRB)
^7, 38, 46), the Survivable Adaptable Fiber Optic Embedded Network (SAFENET) [11, 20, 26], and
FDDN (Fiber Distributed Data Network) [9]. Many embedded real-time applications use FDDI as
backbone networks. For example, FDDI has been selected as a backbone network for NASA's Space
Station Freedom [8, 7, 50].

Our work is motivated by recent advances in the theory of hard real-time scheduling [48, 49]. For
real-time systems, the basic design requirements for a communication protocol and for a centralized
scheduling algorithm are similar: both are constrained by time to allocate a serially used resource
to a set of processes. Liu and Layland [23] addressed the issue of guaranteeing the deadlines of
synchronous (i.e., periodic) tasks in a single CPU environment. They analyzed a fixed priority
preemptive algorithm, called the rate monotonic algorithm, that assigns priorities in inverse relation
to the tasks' periods. They showed that the Worst Case Achievable Utilization of the algorithm is 69%.
Provided that the utilization of the task set is no more than 69%, the task deadlines are always
guaranteed to be satisfied. The algorithm was also proven to be optimal among all fixed priority
scheduling algorithms in terms of achieving the highest worst case utilization. The rate monotonic
scheduling algorithm has been subsequently extended by many researchers [40], and is used in many
hard real-time applications [10].

* This work is supported in part by grants from the Air Force Office of Scientific Research, the National Science
Foundation, the Office of Naval Research, the Research Institute for Computing and Information Systems of the
University of Houston - Clear Lake, and Texas A&M University.

Intuitively, one would believe that a communication protocol that implements the rate monotonic
transmission policy is the most desirable for a real-time communication environment. However,
implementation of the rate monotonic policy requires global priority arbitration every time a node
in the network is ready to transmit a new message. FDDI does not support priority arbitration at
the medium access control level. Consequently, it is difficult, if not impossible, to implement the
rate monotonic transmission policy in an FDDI network.

However,  the  methodology  for  analyzing  the  rate  monotonic  algorithm  has  a  more  profound 
significance  than  merely  its  relevance  to  the  fixed  priority  preemptive  algorithms.  The  methodology 
stresses  the  fundamental  requirements  of  predictability  and  of  stability  in  hard  real-time  environments 
and  is  therefore  also  befitting  to  other  hard  real-time  scheduling  problems.  In  this  methodology  the 
Worst  Case  Achievable  Utilization  is  used  as  a  metric  for  evaluating  the  predictability  of  a  scheduling 
algorithm.  As  long  as  the  CPU  utilization  of  all  tasks  is  within  the  bounds  specified  by  the  metric, 
all  tasks  will  meet  their  deadlines.  This  metric  also  gives  a  measure  of  the  stability  of  the  scheduling 
algorithm  in  the  sense  that  the  tasks  can  be  freely  modified  as  long  as  their  total  utilization  is  held 
within the limit. Because of this, we adopted the same methodology in our study of guaranteeing
message  deadlines  in  FDDI  networks.  We  analyze  the  run-time  control  schemes  of  FDDI  networks 
for  hard  real-time  communication  based  on  the  Worst  Case  Achievable  Utilization. 
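
The utilization-based test that this methodology relies on can be sketched for the rate monotonic case. The sketch below uses the well-known Liu and Layland bound n(2^(1/n) - 1), which tends to ln 2 (about 69%) as n grows; the function names and the example task sets are assumptions chosen for illustration. The test is sufficient only: a task set that fails it may still be schedulable.

```python
import math
from typing import List, Tuple

def liu_layland_bound(n: int) -> float:
    """Utilization bound for n periodic tasks under rate monotonic scheduling."""
    return n * (2 ** (1.0 / n) - 1)

def passes_utilization_test(tasks: List[Tuple[float, float]]) -> bool:
    """tasks holds (execution time, period) pairs; True means deadlines are guaranteed."""
    utilization = sum(c / p for c, p in tasks)
    return utilization <= liu_layland_bound(len(tasks))

# Hypothetical task sets, used only to exercise the test.
light_load = [(1, 4), (1, 5), (1, 10)]   # utilization 0.55
heavy_load = [(2, 5), (4, 7)]            # utilization about 0.97

print(liu_layland_bound(3), passes_utilization_test(light_load))
print(liu_layland_bound(2), passes_utilization_test(heavy_load))
print("limit for large n:", math.log(2))
```

The appeal of the metric is visible here: the check needs only the total utilization, not the individual phases or arrival patterns of the tasks.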


2  Network  and  Message  Models 


We consider a network consisting of two counter-rotating rings. Each ring consists of m nodes
connected by point-to-point links forming a circle, i.e., a token ring. The two rings will be denoted
ring A and ring B. We denote the ring latency by $\tau$, which includes the ring propagation delay, the
node latency delay, the transmission delay of the token, etc. Thus, $\tau$ is the walk time of the token
when none of the nodes disturb it. The ratio of the ring latency $\tau$ to the target token rotation time
(TTRT) is denoted by $\alpha$. The usable ring utilization would therefore be $(1 - \alpha)$ [47].

A  node  can  connect  either  to  one  of  the  rings  or  to  both.  A  node  can  transmit  and  receive 
messages from a ring only if the node connects to it. For those nodes that are connected to two
rings,  we  assume  that  they  have  dual  facilities  for  transmitting  and  receiving  messages  on  both  rings. 
Hence,  the  node  can  simultaneously  transmit/receive  messages  on  both  rings. 

Messages generated in the system at run time may be classified as either synchronous messages
or asynchronous messages. We assume that there are $n_A$ ($n_B$) streams of synchronous messages,
$S_1, S_2, \ldots, S_{n_A}$ ($S_{n_B}$), in the system which form a synchronous message set, $M_A$ ($M_B$),
for ring A (ring B), i.e.,

$$M_A = \{S_1, S_2, \ldots, S_{n_A}\} \qquad (1)$$

and

$$M_B = \{S_1, S_2, \ldots, S_{n_B}\}. \qquad (2)$$

For the convenience of our discussion, we use the notation $M$ to denote either $M_A$ or $M_B$. Similarly,
$n$ denotes either $n_A$ or $n_B$.

Messages  have  the  following  characteristics: 

1. Synchronous messages are periodic, i.e., messages in a synchronous message stream have a
constant inter-arrival time. We denote the period of stream $S_i$ ($i = 1, 2, \ldots, n$) by $P_i$.

2. The deadline of a synchronous message is the end of the period in which it arrives. That is, if
a message in stream $S_i$ arrives at time $t$, then its deadline is at time $t + P_i$.¹

3. Messages from different streams are independent in that message arrivals do not depend on the
initiation or the completion of transmission requests for other messages.

4. The length of each message in stream $S_i$ is $C_i$, which is the maximum amount of time needed
to transmit this message.

5. Asynchronous messages are non-periodic and do not have explicit deadline requirements.

¹ This assumption may be relaxed.


The utilization factor of a synchronous message set, $U(M)$, is defined as the fraction of time spent
by a ring in the transmission of the synchronous messages. That is,

$$U(M) = \sum_{i=1}^{n} \frac{C_i}{P_i}, \qquad (3)$$

where $n$ is the number of synchronous message streams.
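
A minimal sketch of these two quantities, the utilization factor of a message set and the usable ring utilization (1 - α) with α = τ/TTRT, is given below. It assumes integer time units and uses exact fractions; the function names and the example numbers are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple

Stream = Tuple[int, int]  # (message length C_i, period P_i), same time unit

def utilization_factor(streams: List[Stream]) -> Fraction:
    """U(M): sum of C_i / P_i over all synchronous streams on one ring."""
    return sum((Fraction(c, p) for c, p in streams), Fraction(0))

def usable_ring_utilization(ttrt: int, tau: int) -> Fraction:
    """1 - alpha, where alpha = tau / TTRT is the share lost to ring latency."""
    return 1 - Fraction(tau, ttrt)

# Hypothetical message set and ring parameters.
streams = [(2, 20), (3, 30), (5, 100)]
u = utilization_factor(streams)
usable = usable_ring_utilization(ttrt=10, tau=1)
print(u, usable)
```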

A subset of messages, denoted by $M_C$, is mission critical. That is,

$$M_C \subseteq M_A \cup M_B. \qquad (4)$$

The objective of our study is to develop technology that guarantees the message deadlines of $M_A$
and $M_B$ under normal conditions, and guarantees the message deadlines of $M_C$ when a network fault
occurs. To facilitate this fault tolerant guarantee of mission critical messages, we assume that nodes
which are required to transmit/receive a critical message are connected to both rings. In this way,
once a fault occurs on one ring, the other ring can be used to transmit/receive critical messages.

Without  loss  of  generality  we  assume  that  there  is  one  stream  of  synchronous  messages  on  a  node 
per ring (i.e., m = n). We can formally prove that an arbitrary token ring network where a node
may  have  zero,  one,  or  more  streams  of  synchronous  messages  to  transmit  can  be  transformed  into 
a  logically  equivalent  network  with  one  stream  of  synchronous  message  per  node. 
3 Synchronous Capacity Allocation

3.1  Timed  Token  MAC  Protocol 

Guaranteeing message deadlines requires the proper control of medium access. This is the function
of a medium access control (MAC) protocol. FDDI uses the timed token MAC protocol, in which
messages are segregated into separate classes: the synchronous class and the asynchronous class
[12]. Synchronous messages arrive at the system at regular intervals and may be associated with
deadline constraints. The idea behind the timed token protocol is to control the token rotation
time. During network initialization, a protocol parameter called the Target Token Rotation Time
(TTRT) is determined, which indicates the expected token rotation time. Each station is assigned
a fraction of the TTRT, known as its synchronous capacity², which is the maximum time a station
is permitted to transmit synchronous messages every time it receives the token. Thus, once a node
receives the token, it transmits its synchronous messages, if any, for a time no more than its allocated
synchronous capacity. It can then transmit its asynchronous messages only if the time elapsed since
the previous token departure from the same node is less than the value of TTRT, i.e., only if the
token arrives earlier than expected.

Guaranteeing a message deadline implies that the message will be transmitted before its deadline.
With a token passing protocol, a node can transmit messages only when it captures the token.
Hence, if a message deadline is to be guaranteed, the token should visit the node where the message
is waiting before the expiration of the message deadline. That is, in order to guarantee message
deadlines in a token ring network, it is necessary to bound the time between two consecutive visits
of the token to a node (called the token rotation time or access time). The timed token protocol
possesses this property. In [18, 39], Johnson and Sevcik formally proved that when the network
operates normally (i.e., there is no failure), the token rotation time between two consecutive visits to
a node is bounded by twice the expected token rotation time (i.e., 2 · TTRT).

² Some other synonymous terms that researchers use are: Bandwidth allocation, Synchronous allocation,
Synchronous bandwidth assignments, and High Priority token holding time.

Although a bounded token rotation time is indispensable, it is inadequate by itself for
guaranteeing message deadlines. A node with insufficient synchronous capacity may be
unable to complete the transmission of a synchronous message before its deadline. On the other
hand, allocating excess synchronous capacities to the nodes could increase the token rotation time,
which may also cause message deadlines to be missed. Thus, guaranteeing message deadlines is also
dependent upon the appropriate allocation of the synchronous capacities to the nodes.
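
The timed token rule and the 2 · TTRT bound discussed in this subsection can be exercised with a small simulation. The sketch below is a simplified model and not the FDDI standard's timer logic. It assumes integer time units, nodes that always have synchronous and asynchronous traffic queued, ring latency split evenly across hops, an asynchronous allowance equal to the token's earliness (TTRT minus the time elapsed since the node's previous token departure), and synchronous capacities whose sum plus the ring latency does not exceed TTRT. All names are hypothetical.

```python
from typing import List, Optional

def simulate_timed_token(sync_capacity: List[int], ttrt: int,
                         ring_latency: int, rotations: int) -> int:
    """Return the longest observed time between consecutive token arrivals at a node."""
    n = len(sync_capacity)
    if ring_latency % n != 0:
        raise ValueError("this sketch assumes ring_latency divisible by node count")
    if sum(sync_capacity) + ring_latency > ttrt:
        raise ValueError("synchronous capacities plus latency must fit in TTRT")
    hop = ring_latency // n
    now = 0
    last_arrival: List[Optional[int]] = [None] * n
    last_departure = [0] * n
    worst = 0
    for _ in range(rotations):
        for i in range(n):
            if last_arrival[i] is not None:
                worst = max(worst, now - last_arrival[i])
            last_arrival[i] = now
            elapsed = now - last_departure[i]
            now += sync_capacity[i]          # synchronous traffic, up to H_i
            if elapsed < ttrt:               # token is early: asynchronous allowed
                now += ttrt - elapsed
            last_departure[i] = now
            now += hop                       # token travels to the next node
    return worst

# Hypothetical ring: three nodes, TTRT = 12, ring latency = 3.
TTRT = 12
worst_rotation = simulate_timed_token([2, 3, 1], TTRT, 3, rotations=50)
print("worst observed rotation:", worst_rotation, "bound:", 2 * TTRT)
```

In this model the observed rotation time exceeds TTRT under saturating asynchronous load but stays within 2 · TTRT, which is why a deadline analysis has to plan for rotations longer than the target.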


3.2  Allocation  Schemes 
Definition  and  Examples 

Denote by $H_i$ the synchronous capacity of node $i$ for a particular ring. The synchronous message
parameters (given by the $C_i$'s and $P_i$'s) at the various stations, the value of TTRT, and the ring
latency $\tau$ should be the dictating factors for the allocation of the $H_i$'s. We define a synchronous
capacity allocation scheme as an algorithm that, given as input the values of all $C_i$ and $P_i$ in the
message set and the values of TTRT and $\tau$, will produce as output the values of the synchronous
capacities $H_i$ to be allocated to each station $i$ in the network.

Let function $f$ represent an allocation scheme. Then,

$$(H_1, H_2, \ldots, H_n) = f(C_1, C_2, \ldots, C_n, P_1, P_2, \ldots, P_n, TTRT, \tau). \qquad (5)$$

Some  of  the  allocation  schemes  which  we  consider  are  listed  below: 

• Full length scheme. In this scheme the synchronous capacity allocated to a node is equal to
the total time required for transmitting its synchronous messages, i.e.,

$$H_i = C_i. \qquad (6)$$

This scheme attempts to transmit a synchronous message arriving at a node in a single turn
rather than splitting it into chunks and distributing its transmission evenly over its period $P_i$.

• Proportional scheme. In this scheme the synchronous capacity allocated to node $i$ is
proportional to the ratio of $C_i$ and $P_i$ at node $i$, i.e.,

$$H_i = \frac{C_i}{P_i}(TTRT - \tau). \qquad (7)$$


• Equal partition scheme. In this scheme the usable portion of TTRT is divided equally among
the $n$ nodes in allocating their synchronous capacities, i.e.,

$$H_i = \frac{TTRT - \tau}{n}, \qquad (8)$$

where $n$ is the number of nodes in the system.

• Normalized proportional scheme. In this scheme the synchronous capacity is allocated
according to the normalized load of the synchronous messages on a node, i.e.,

$$H_i = \frac{C_i/P_i}{U}(TTRT - \tau), \qquad (9)$$

where $U = \sum_{i=1}^{n} C_i/P_i$.
• Local scheme. In this scheme, the message length is divided by the worst case number of
token visits to a node during a single message period:

$$H_i = \frac{C_i}{\lfloor P_i/TTRT \rfloor - 1}. \qquad (10)$$

Note that this scheme allocates the synchronous capacity without using information regarding
messages on other nodes. This is advantageous for run-time network management. If the
parameters of a message stream at a node change during run-time, a local allocation scheme
need only adjust the synchronous capacity of the node involved. Other nodes are not disturbed.
That is, the entire network can continue its normal operations while individual nodes change
their synchronous capacities in response to changing message parameters.
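
The five schemes listed above can be written as interchangeable functions of the message parameters, TTRT, and the ring latency. The sketch below assumes integer time units and exact fractions; the function names and the example message set are hypothetical, and the local scheme is only defined here when each period is at least 2 · TTRT so that the divisor is positive.

```python
from fractions import Fraction
from typing import List, Tuple

Stream = Tuple[int, int]  # (message length C_i, period P_i)

def full_length(streams: List[Stream], ttrt: int, tau: int) -> List[Fraction]:
    return [Fraction(c) for c, _ in streams]

def proportional(streams: List[Stream], ttrt: int, tau: int) -> List[Fraction]:
    return [Fraction(c, p) * (ttrt - tau) for c, p in streams]

def equal_partition(streams: List[Stream], ttrt: int, tau: int) -> List[Fraction]:
    return [Fraction(ttrt - tau, len(streams)) for _ in streams]

def normalized_proportional(streams: List[Stream], ttrt: int, tau: int) -> List[Fraction]:
    u = sum((Fraction(c, p) for c, p in streams), Fraction(0))
    return [Fraction(c, p) / u * (ttrt - tau) for c, p in streams]

def local(streams: List[Stream], ttrt: int, tau: int) -> List[Fraction]:
    capacities = []
    for c, p in streams:
        visits = p // ttrt - 1   # worst case number of token visits per period
        if visits <= 0:
            raise ValueError("local scheme needs P_i >= 2 * TTRT")
        capacities.append(Fraction(c, visits))
    return capacities

# Hypothetical message set: three streams, TTRT = 10, ring latency = 1.
streams = [(2, 40), (3, 60), (1, 20)]
for scheme in (full_length, proportional, equal_partition,
               normalized_proportional, local):
    print(scheme.__name__, [str(h) for h in scheme(streams, 10, 1)])
```

Note that only `local` computes each capacity from that node's own parameters; the normalized proportional scheme needs the utilization of the whole message set, which is the practical difference the text points out.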


Constraints 

The  synchronous  capacities  allocated  to  the  nodes  by  any  scheme  must  satisfy  two  constraints  in 
order  to  ensure  that  real-time  messages  can  be  transmitted  before  their  deadlines  and  that  the  timed 
token  protocol  requirements  are  satisfied. 

• Protocol Constraint: The sum total of all the synchronous capacities allocated to all the nodes
in the ring should not be greater than the target token rotation time minus the token walk
time, i.e.,

$$\sum_{i=1}^{n} H_i \le TTRT - \tau. \qquad (11)$$


• Deadline Constraint: The allocation of the synchronous capacities to the nodes should be such
that the synchronous messages are always guaranteed to be transmitted before their deadlines.
In other words, if $X_i$ is the minimum amount of time available for node $i$ to transmit its
synchronous messages in a time interval $(t, t + P_i)$, then

$$X_i \ge C_i. \qquad (12)$$

Note that $X_i$ will be a function of $H_i$ and the number of token visits to node $i$ in the time
interval $(t, t + P_i)$.


Formally,  we  say  that  a  set  of  synchronous  messages  is  guaranteed  by  an  allocation  scheme  if  both 
the  protocol  and  the  deadline  constraints  are  satisfied.  Once  a  message  set  is  guaranteed,  messages 
will be transmitted before their deadlines as long as the network operates normally.
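
A conservative check of the two constraints can be sketched as follows. The protocol constraint is taken directly from the text. For the deadline constraint, the text does not give a closed form for the minimum available time, so this sketch assumes a lower bound: the worst case number of token visits per period (the same count that appears in the local scheme) multiplied by the node's synchronous capacity. Passing this check is therefore sufficient but not necessary. All names and numbers are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

Stream = Tuple[int, int]  # (message length C_i, period P_i)

def satisfies_protocol_constraint(h: Sequence[Fraction], ttrt: int, tau: int) -> bool:
    """Sum of synchronous capacities must fit within TTRT minus ring latency."""
    return sum(h) <= ttrt - tau

def min_transmit_time_lower_bound(h_i: Fraction, p_i: int, ttrt: int) -> Fraction:
    """Assumed lower bound on the time available to node i in one period."""
    visits = max(p_i // ttrt - 1, 0)   # worst case number of token visits
    return visits * h_i

def is_guaranteed(streams: List[Stream], h: Sequence[Fraction],
                  ttrt: int, tau: int) -> bool:
    if not satisfies_protocol_constraint(h, ttrt, tau):
        return False
    return all(min_transmit_time_lower_bound(h_i, p, ttrt) >= c
               for (c, p), h_i in zip(streams, h))

# Hypothetical message set with TTRT = 10 and ring latency = 1.
streams = [(2, 40), (3, 60), (1, 20)]
local_h = [Fraction(2, 3), Fraction(3, 5), Fraction(1)]   # local scheme capacities
proportional_h = [Fraction(9, 20)] * 3                    # proportional scheme capacities
oversized_h = [Fraction(4), Fraction(4), Fraction(4)]     # violates the protocol constraint

print(is_guaranteed(streams, local_h, 10, 1))
print(is_guaranteed(streams, proportional_h, 10, 1))
print(is_guaranteed(streams, oversized_h, 10, 1))
```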


Performance  Metric 

Obviously,  there  are  many  ways  to  construct  synchronous  capacity  allocation  schemes.  We  would  like 
to  classify  and  evaluate  allocation  schemes  so  that  proper  recommendations  can  be  made  to  network 
designers  and  managers  on  what  allocation  schemes  to  use.  An  appropriate  metric  must  first  be 
selected  in  order  to  evaluate  and  compare  the  effects  of  synchronous  capacity  allocation  schemes  on 
the  performance  of  FDDI  networks. 

As  mentioned  earlier,  we  adopt  the  methodology  developed  in  analyzing  the  rate  monotonic 
scheduling algorithm. Following this methodology, we use the Worst Case Achievable Utilization as
the  metric  to  be  used  in  evaluating  and  comparing  the  schemes. 

We say $U_x$ is an achievable utilization of scheme $x$ if scheme $x$ can guarantee all synchronous
message sets whose utilization is less than or equal to $U_x$. The Worst Case Achievable Utilization
$U_x^*$ of a scheme $x$ is the least upper bound of its achievable utilizations $U_x$. That is, as long as
the utilization factor of a synchronous message set is not more than $U_x^*$, the message set can
be guaranteed by scheme $x$. We consider one scheme to be better than another if its Worst Case
Achievable Utilization is higher.

Name                       Formula of $H_i$                                        W.C.A.U.*

Full length                $H_i = C_i$                                             0

Proportional               $H_i = \frac{C_i}{P_i}(TTRT - \tau)$                    0

Equal partition            $H_i = \frac{TTRT - \tau}{n}$

Normalized proportional    $H_i = \frac{C_i/P_i}{U}(TTRT - \tau)$                  $(1 - \alpha)/3$ †

Local                      $H_i = \frac{C_i}{\lfloor P_i/TTRT \rfloor - 1}$        $(1 - \alpha)/3$ †

* W.C.A.U. is the abbreviation of 'Worst Case Achievable Utilization'.

† $\alpha = \tau/TTRT$.

Table 1: Summary of the Synchronous Capacity Allocation Schemes.

The main advantages of using the Worst Case Achievable Utilization as the performance metric
are as follows:

• This metric evaluates the predictability of a hard real-time communication system. As long
as the utilization of a synchronous message set is within the bound specified by the metric, all
synchronous messages in the set will meet their deadlines.

• This metric gives a measure of the stability of the system in the sense that the parameters of
synchronous messages can be freely changed without affecting the deadline guarantees, provided
that the total utilization of the message set is held within the limit.

• In practice, using this metric simplifies the network management considerably when configuring
the system, as it eliminates the problem of being encumbered with individual values of
synchronous and asynchronous message lengths, inter-arrival intervals, phase differences between
message arrivals, relative positions of the nodes, token position at initialization, etc. As long as
network managers can ensure that the total utilization of time-critical synchronous messages
is no more than the Worst Case Achievable Utilization of the protocol, they can be assured
that the message set will be transmitted with no deadlines being missed.

Evaluation Results

We analyzed five synchronous capacity allocation schemes based on their worst case achievable
utilizations. The results are summarized in Table 1.

Our analysis reveals that an improper allocation of the synchronous capacities (such as by the
full length scheme and the proportional scheme) could lead to a Worst Case Achievable Utilization
of 0%. That is, the deadline of some message could be missed even when the synchronous traffic is
extremely low. Both the normalized proportional allocation scheme and the local allocation scheme,
on the other hand, have a worst case achievable utilization of 0.33. If the utilization of a set of
synchronous message streams is less than 0.33 of the usable network capacity, then the synchronous
messages will be guaranteed by these allocation schemes. The remaining 0.67 of the usable network
capacity can be used for the transmission of asynchronous messages.
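
In practice this result reduces admission control to a single comparison. The sketch below takes the Worst Case Achievable Utilization of the normalized proportional and local schemes to be (1 - α)/3 with α = τ/TTRT, which is how we read Table 1; the function names and example numbers are assumptions for illustration.

```python
from fractions import Fraction
from typing import List, Tuple

Stream = Tuple[int, int]  # (message length C_i, period P_i)

def worst_case_achievable_utilization(ttrt: int, tau: int) -> Fraction:
    """(1 - alpha) / 3, i.e. one third of the usable ring capacity."""
    alpha = Fraction(tau, ttrt)
    return (1 - alpha) / 3

def admit(streams: List[Stream], ttrt: int, tau: int) -> bool:
    """Accept a synchronous message set if its utilization is within the bound."""
    u = sum((Fraction(c, p) for c, p in streams), Fraction(0))
    return u <= worst_case_achievable_utilization(ttrt, tau)

# Hypothetical ring with TTRT = 10 and ring latency = 1 (alpha = 0.1).
bound = worst_case_achievable_utilization(10, 1)
accepted = admit([(2, 40), (3, 60), (1, 20)], 10, 1)   # utilization 3/20
rejected = admit([(5, 20), (3, 20)], 10, 1)            # utilization 2/5
print(bound, accepted, rejected)
```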
4 Dealing with Link Faults

The  results  presented  in  the  last  section  are  based  on  the  assumption  that  there  is  no  network  failure. 
To  provide  deadline  guarantees  in  the  presence  of  a  network  fault,  we  also  have  to  exploit  the  dual 
ring  architecture  and  the  connection  management  mechanism  proposed  in  the  FDDI  standard. 


4.1  Dual  Ring  Architecture  and  Link  Faults 

The basic configuration of an FDDI network is a dual counter-rotating ring as shown in Figure 1. The
dual rings provide fault tolerant properties to FDDI networks, since the existence of a link fault can
be signaled on the opposing link. A link fault is defined as a fault that occurs on the links between
nodes resulting in a lack of communication across a single fiber. Examples include a single broken
fiber, a faulty optical receiver, and a faulty optical transmitter. Other faults (e.g., loss of power to a
node) may be treated similarly to a link fault. See [29] for a survey of FDDI fault classification and
management.

In  the  Station  Management  (SMT)  of  FDDI,  there  are  built-in  mechanisms  to  detect  and  to 
recover  from  a  link  fault.  According  to  the  FDDI  standard,  once  a  link  fault  is  detected,  a  sequence 
of  ring  recovery  processes  (i.e.,  the  token  reclaim  process,  beacon  process,  etc.)  will  be  initiated.  If 
the  fault  is  transient  and  hence  recoverable,  the  ring  may  be  functioning  again  after  these  processes. 
If the fault is permanent, two additional approaches are specified by the FDDI standard to recover
the  network: 

• Wrap-up: The fault domain is traced and the stations around the broken link perform a
wrap-up operation, i.e., the two rings are effectively connected to each other at the stations
immediately adjacent to the fault. This re-establishes a single ring between all the nodes (see Figure 2).

•  Global  Hold:  Another  strategy  is  to  prevent  the  wrap-up  of  the  rings  and  hold  the  operational 
ring,  as  it  is,  for  continuing  communication  service.  The  messages  from  the  faulty  ring  can  be 
transferred  to  the  operational  ring  (see  Figure  3). 
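
As a rough illustration of the wrap-up operation, the following Python sketch computes the order in which stations are visited on the single ring that results from wrapping. The function name, the station numbering, and the assumption that the primary ring runs in increasing station order are ours, chosen for the example; they are not taken from the FDDI standard.

```python
def wrapped_ring_path(n, fault_a):
    """Return one full circuit of the wrapped ring as a list of station numbers.

    Assumptions for this sketch: stations are numbered 0..n-1, the primary
    ring runs 0 -> 1 -> ... -> n-1 -> 0, the secondary ring runs the other
    way, and the faulty link joins station fault_a to station (fault_a + 1) % n.
    """
    if n < 2 or not 0 <= fault_a < n:
        raise ValueError("need n >= 2 and 0 <= fault_a < n")
    start = (fault_a + 1) % n
    # Primary direction: from the station just after the fault around to
    # the station just before it (which is fault_a itself).
    forward = [(start + i) % n for i in range(n)]
    # At fault_a the path wraps onto the secondary ring and travels back
    # through the interior stations until it reaches `start`, where it
    # wraps onto the primary ring again.
    backward = forward[-2:0:-1]
    return forward + backward
```

With four stations and a fault between stations 1 and 2, `wrapped_ring_path(4, 1)` gives `[2, 3, 0, 1, 0, 3]`: the two stations next to the fault appear once per circuit, and every other station appears twice, once on each original ring.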


4.2  Approaches 

Although  the  connection  management  of  FDDI  guarantees  network  service  before  or  after  a  single  link 
fault  occurs,  it  does  not  support  transmission  of  messages  on  the  faulty  ring  during  fault  detection 
and  recovery.  For  hard  real-time  communication,  this  is  inadequate  because  the  fault  detection 
and  recovery  processes  take  several  seconds  or  more  to  complete.  Message  deadlines  in  many  hard 
real-time  applications  are  usually  of  a  much  smaller  order  of  magnitude. 

Furthermore,  once  a  fault  occurs  on  one  ring,  messages  can  only  be  transmitted  on  another  ring. 
If  both  rings  are  fully  utilized  before  the  fault  occurs,  it  is  impossible  to  transmit  over  a  single  ring  the 
messages  that  were  previously  on  the  two  rings.  Some  of  the  messages  will  have  to  be  dropped.  We 
assume  that  when  a  link  fault  occurs,  the  network  changes  into  a  link  fault  mode.  In  this  mode,  not 
all  the  messages  are  to  be  transmitted.  Only  a  subset  of  messages  that  are  critical  to  the  mission  will 
be  transmitted  and  their  deadlines  have  to  be  guaranteed  at  any  time,  including  during  the  period  of 
fault  detection  and  recovery.  We  assume  that  the  capacity  of  one  ring  is  sufficient  to  transmit  these 
mission  critical  messages.  The  objective  is  to  develop  network  run-time  control  schemes  that  will  be 
able  to  guarantee  the  deadlines  of  critical  messages  through  the  entire  mission  even  in  the  presence 
of  a  link  fault. 

The following approaches have been proposed to deal with this problem:

•  Full Duplication Method. Duplicate the transmission of critical messages on both rings so
that when one ring is unavailable, the deadlines of messages are still guaranteed because they
are also transmitted on the other ring. This solution is the simplest, but it wastes
bandwidth whenever there is no fault.

•  Dynamic Reallocation Method. With this approach, once a fault is detected, critical messages
from the faulty ring are reallocated to the other ring, which is still operating. Although the
bandwidth is fully utilized when there is no fault in the network, the implementation of this
approach requires a detailed analysis of timing factors in fault detection and mode change. A
time-efficient message reallocation scheme and mode change protocol are also needed. Furthermore,
this solution cannot be applied to those applications where the deadlines of critical messages
would be too small to tolerate the overhead of fault detection and dynamic reallocation.

•  Integrated Method. An alternative is to integrate the full duplication method and the
dynamic reallocation method. The transmission of critical messages with very small deadlines is
duplicated on both rings. A dynamic reallocation will be performed for other critical messages
when a link fault occurs on one ring. This method should utilize the network better than the
full duplication method while overcoming the shortcomings of dynamic reallocation.


Fig 1: Dual Ring Architecture of FDDI


Fig 2: Fault Recovery after Wrapped Up


Fig 3: Fault Recovery with the Hold Policy
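
The partitioning step of the integrated method can be sketched in Python as follows. This is only an illustration under stated assumptions: each critical message carries a single relative deadline, the combined cost of fault detection and reallocation is bounded by one known value, and the names `CriticalMessage`, `plan_integrated`, and `recovery_overhead_ms` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalMessage:
    name: str
    deadline_ms: float  # relative deadline of the message (assumed known)


def plan_integrated(messages, recovery_overhead_ms):
    """Split critical messages into (duplicated, reallocated) name lists.

    A message whose deadline cannot absorb the assumed worst-case time for
    fault detection plus reallocation is duplicated on both rings; every
    other message is sent on one ring and moved only when a fault occurs.
    """
    duplicated, reallocated = [], []
    for msg in messages:
        if msg.deadline_ms <= recovery_overhead_ms:
            duplicated.append(msg.name)
        else:
            reallocated.append(msg.name)
    return duplicated, reallocated
```

A real scheme would also have to check that the duplicated and reallocated traffic together fit within the capacity of one ring; that check is omitted here.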

We are currently developing techniques for implementing the above three approaches
and for evaluating and comparing their performance. The performance metrics we are
interested in include the effectiveness of network utilization in both normal and faulty situations, the
run-time overheads, and the range of applications to which each approach applies.


5  Summary 

We address issues pertaining to deadline guarantees in a degraded FDDI network. We aim to
provide deadline guarantees to a set of mission critical messages throughout the entire mission,
even in the presence of a fault. This is particularly important in practice because some critical
applications do need uninterrupted real-time service.

Our  approach  is  (upward)  compatible  with  the  proposed  standard.  Hence,  the  results  obtained 
from  our  work  will  be  immediately  applicable  to  the  design  and  analysis  of  distributed  hard  real-time 
systems  where  an  FDDI  network  is  used. 

We analyze the system by deriving its worst case utilization bound. This metric is particularly
important because it indicates the safety margin of the system and provides a measure of system
stability. All previous work regarding this measure is related to the rate monotonic scheduling
algorithm. Our work is the first to derive the worst case utilization bound for a scheduling
environment where global priority arbitration is not supported and hence the rate monotonic
algorithm cannot be used.

Abstract

The  essential  thesis  of  this  paper  is  that  the  next  generation  of  industrial- 
grade  process  plant  automation  and  control  systems  requires  a  new  system 
model.  Key  features  of  this  new  model  include  enhanced  real-time  control 
semantics  to  support  the  interactions  of  distributed  fine-  and  coarse-grained 
objects.  The  interaction  model  is  based  on  adaptive  scheduling  policies, 
dynamically  adjustable  system  configurations,  and  new  classes  of  abstract 
real-time  data  types.  The  emergence  of  distributed  computing  machinery  to 
host  objects  based  on  these  semantics  will  allow  the  creation  of  mission- 
critical,  vertically  integrated  industrial  applications  whose  spans  of  control 
cover  a  much  wider  operating  domain  of  the  process  plant.  The  new  system 
model  will  provide  the  basis  for  a  plant  control  system  (PCS)  environment 
that  can  subsume  the  duties  of  today's  more  limited  and  proprietary 
regulatory  distributed  control  systems  (DCS)  while  providing  the  platform 
for  a  new  generation  of  advanced  plant  controls. 

Key  Words:  Continuous  process  plant  control;  distributed  control  system 
(DCS); plant control system (PCS); objects; threads; adaptive
scheduling;  dynamic  configuration;  real-time  control. 

1. Introduction

The  domain  of  commercial  industrial  process  control  is  served  today  by  automation  systems 
that  have  been  optimized  for  linear,  deterministic,  sampled-data  regulatory  control 
problems.  In  the  main,  these  systems  utilize  distributed  microprocessor-based  elements 
interconnected  by  various  proprietary  communications  structures  to  implement  classical 
analog  regulatory  loop  control  policies  and  mechanisms.  For  a  given  plant  control 
application (e.g., industrial steam production), control policies are typically expressed by
engineers in the semantics of feedforward and feedback control elements (e.g., the
proportional-integral-derivative, or PID, controller). These elements are
driven by discrete-time, quantized measurements comprising small vectors of integer and
real number values (e.g., pressure, temperature, pH, flow rate). Outputs from the control
logic elements affect process state through actuation of final control devices (e.g., valve
positioners, motor controls, and electrical switch gear). This input-control-output relation
defines a “control loop” and is the architectural basis for the current generation of
instrumentation and control products.
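
To make the input-control-output relation concrete, the following Python sketch shows a textbook discrete-time PID element of the kind a control loop wraps around. It is a minimal illustration, not any vendor's control block: the class name, the gains `kp`, `ki`, `kd`, and the fixed sample period `dt` are assumptions made for the example.

```python
class PID:
    """Textbook positional PID controller sampled at a fixed period dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0    # running integral of the error
        self.prev_error = 0.0  # error seen at the previous sample

    def update(self, setpoint, measurement):
        """Consume one sampled measurement and return one control output."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Each call to `update` corresponds to one pass around the loop: a quantized measurement comes in, and a value for the final control device goes out. Because `prev_error` starts at zero, the derivative term responds to the full initial error on the first sample; production controllers usually guard against this.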

More advanced regulatory control systems add to the basic mechanisms facilities for multi-loop
control policies predicated on process identification, optimal estimators, optimal
controllers, and high-fidelity process simulations. Some even support embedded advanced
control primitives such as Smith Predictors, multi-variable and adaptive (e.g., self-tuning)
controllers, and batch process controllers. These more advanced features support
the creation of control policies appropriate for the management of more complex processes
whose character may be less deterministic, stochastically driven, not directly observable,
or  only  partially  controllable.  Services  provided  by  these  advanced  functions  generally 
define  the  base  of  what  is  considered  the  supervisory  control  domain. 

Contemporary  regulatory  control  systems  have  grown  in  both  design  and  application  from 
the  bottom  up,  having  been  derived  from  earlier  electro-mechanical,  pneumatic,  and  analog 
control  precursors  dating  back  to  the  industrial  revolution.  They  are  typically  specified  and 
implemented  by  conservative  operating  plant  personnel  whose  primary  interests  are  the 
production  of  the  product(s)  the  plant  exists  to  manufacture.  The  automation  systems  have 
historically  been  seen  as  stand-alone  systems  whose  inputs  and  outputs  are  distinct  from 
those  of  other  plant  processes. 

For  example,  it  is  not  uncommon  to  find  in  a  pulp  and  paper  mill  [Smoot89]  three  different 
control  systems,  one  responsible  for  the  power  house  (electricity  and  steam  processes),  one 
controlling  the  pulp  mill  (fiber  and  effluent  process),  and  one  controlling  each  paper 
machine  (paper  production  processes)  in  the  mill.  These  various  control  systems  were 
probably  purchased  at  different  times,  from  different  vendors,  at  different  points  in  the 
evolution  of  distributed  control  technology,  by  different  plant  management  personnel,  for 
different  economic  reasons,  and  without  the  benefit  of  an  overall  plant  integration  and 
automation  policy.  Rationalization,  integration,  training,  maintenance,  and  inter-operation 
are  generally  only  afterthoughts,  and  today  represent  significant  elements  in  the  total 
operating  cost  equation  for  the  plant. 

At  the  same  time  individual  control  systems  were  being  implemented  at  the  operating  plant 
level,  new  levels  of  automation  addressing  a  different  class  of  problems  (with  completely 
different  semantics)  were  being  applied  to  enterprise  business  systems.  Business  systems 
are  typically  not  viewed  as  control  systems  per  se.  They  are  generally  sponsored  by 
corporate  finance  and  MIS  organizations  whose  problem  domains  are  semantically  different 
from  that  of  plant  operations,  and  whose  policies  and  mechanisms  have  grown  from  a 
distinctly  unique  tradition. 

During the period from 1960 to 1980, while much of today’s plant and business automation
was  being  installed,  a  great  deal  of  work  was  done  to  bring  these  two  disciplines  together. 
Most  of  this  work  was  academic  with  its  foundation  based  on  principles  of  operations 
research,  econometrics,  cybernetics,  and  the  modeling  and  simulation  of  large-scale 
dynamic  systems.  During  the  last  two  decades,  computer  science  has  given  us  a  rich  set  of 
domain  neutral  semantics  within  which  to  express  control  and  automation  problems, 
solutions,  policies,  and  mechanisms  that  are  applicable  to  the  traditional  regulatory  and 
business  domains.  During  the  same  period  global  commercial  pressures  have  made 
production  efficiencies,  product  quality,  and  environmental  and  resource  management  issues 
critically  important  business  policy  and  capital  investment  drivers. 

These  factors  have  each  led  to  increasingly  richer  requirements  to  interconnect  plant 
process  control  systems  with  operational  business  systems  to  facilitate  such  applications  as 
optimal plant production scheduling, raw material resource planning, and compliance with
regulatory  agency  tracking  and  reporting  requirements.  For  the  last  decade  these 
applications  requirements  have  led  increasingly  to  connectivity  and  inter-operability 
requirements  that  have  helped  create  an  entire  industry  based  on  the  professional  practice 
of  systems  integration.  They  have  also  given  impetus  to  an  entire  spectrum  of  industrial, 
national,  and  international  open  systems  standards  movements. 

It is our thesis that during the next decade world-wide commercial forces will justify the
fusion of business and regulatory control policies. This will rationalize
intra- and inter-plant operating policies and establish consistent
operating and control semantics. This unified set of operating and control semantics will
foster reusable applications, encourage integration, and lower the overall installed and
operating costs per automation function. The science of distributed information systems is
sufficiently rich today (with a few important exceptions) to provide the essential computing
and communication fabric on which semantically consistent mechanisms can execute. We
will explore this issue in the following pages.

NATO Advance Study Institute on Real-Time Control

2.  Contemporary  Industrial  Process  Controls 

Contemporary  industrial  digital  control  systems  (DCS’s)  provide  direct  digital  data 
acquisition  and  control  of  industrial  continuous,  batch  and  discrete  manufacturing  processes. 
For  a  number  of  logical  and  historical  reasons,  these  control  systems  fall  within  an 
automation  hierarchy.  The  figure  below  depicts  this  logical  hierarchy  of  the  automation 
platform.  The  levels  imply  the  span  of  influence  of  the  control  applications  that  populate 
physical components that comprise the total plant automation system. The five-level
hierarchy combines control elements ranging from field instrumentation elements at the base
to inter-plant control elements at the top.

Level 0  L0 defines a domain that encapsulates applications (e.g., transducer management)
responsible for field process measurement, actuation or analysis. L0 objects are
end-systems on field communications links responsible for either input to, or
output from, an L1 regulatory or cell control. Therefore, L0 elements are
generally grouped and associated with specific Level 1 control policies.

Level 1  L1 objects implement policies governing data acquisition, filtering, and
regulatory or sequence control functions. L1 objects are responsible for basic
manufacturing cell-level (inter-transducer) direct digital control,
implementing the automation responsible for primary physical process
supervision and safety-related process management. L1 has its roots in electro-mechanical,
pneumatic and analog controls, which provide the semantic
framework for control policies and mechanisms found at this level. This level
exhibits the most stringent real-time and fault-tolerant requirements within
the hierarchy.

Level 2  L2 defines the supervisory or area control domain responsible for basic area
(inter-cell) production control. L2 is generally associated with plant control
rooms where many regulatory loops are consolidated into higher level process
control. L2 objects provide mechanisms which implement policies governing
operator interfaces, process data archives, trend analysis, alarm management,
diagnostics, plant area start-up and shutdown, and L1 configuration and set-point
control. This level generally defines the lowest level for horizontal
(wide-span) control policies.

Level  3  L3  represents  the  intra-plant  (inter-area)  control  domain  where  policies 
governing  plantwide  coordination,  cooperation  and  control  are  implemented.  L3 
objects  provide  mechanisms  that  govern  plant  production  scheduling,  energy  and 
raw  material  utilization,  inventory  and  work-in-process,  product  quality,  and 
maintenance  management  policies.  This  level  provides  the  intra-plant  control 
room  domain,  where  plant  management  personnel  can  interact  with  the  operating 
conditions  of  the  overall  plant.  This  is  a  site-specific  wide-span  control  domain. 

Level  4  L4  is  the  inter-plant  (intra-enterprise)  control  domain  providing  objects 
responsible  for  enforcing  global  product  production  coordination,  supervision 
and  control.  This  domain  is  relevant  to  enterprises  operating  multiple  plants 
with  common  or  shared  production  facilities.  For  example,  two  chemical  plants 
responsible  for  manufacturing  a  specific  polymer  may  be  coordinated  to  meet 
volume  commitments  under  the  uncertainties  of  maintenance,  labor, 
transportation,  and  raw  material  availability. 

From an implementation perspective, existing DCS designs are focused on two principal
areas: I/O front-ends and L1 controllers. This has focused attention on the interconnection
networks running between these two elements, and competitive systems today have taken
different approaches to implementing their respective control networks (aka “data
highways”). The designs are optimized for low cost per I/O point (analog and digital),
10 ms control loop response times, high availability, and low MTTR.

User  interfaces  are  today,  for  the  most  part,  based  on  proprietary  graphical  presentation 
systems.  Many  are  hosted  on  general  purpose  workstation-class  machines  from  HP,  DEC, 
and Sun. The principal application engineering tools used are graphical control block
programming environments that allow the process control engineer to cut-and-paste
control software objects (e.g., a function generator or PID block) into a design. These
graphical descriptions, each of which typically defines a loop control policy, are then
compiled into executable “segments” that are loaded into one or more of the L1 controllers.

L1 controllers are specially designed single board computers that are optimized for
reliability, environmental hardness, n:1 or 1:1 redundancy, high-speed communications
with peer L1 machines, hot insertion into their backplanes, and multiplexed
communications with the L0 input/output subsystems. L1 and L0 devices generally operate
in a master-slave relation, with a single L1 master responsible for up to 32 L0 I/O slaves.

For the purposes of this paper, the salient feature of L1 (and some L0) elements is the
manner in which loop segment code (and complex measurement, actuation, and analysis code) is
scheduled and executed in “real-time.” Contemporary DCS implementations have taken a
very conservative approach, based on best practices of the late 1970’s and early 1980’s
when these systems were architected. There were a number of silicon-based executives
available such as VRTX, MTOS, and pSOS that provided the basic multi-tasking kernel
primitives required for interrupt-driven, priority-based scheduling. These kernels were
optimized (for the time) for low task switching overhead, priority-based queuing and task
dispatching, and primitive interprocess signaling. To simplify the designs, and to keep
context-switch latencies to a minimum, DCS vendors resorted to the simplest of all task
scheduling policies -- fixed priority within fixed time cycles.
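
One plausible reading of “fixed priority within fixed time cycles” is a cyclic executive: time is divided into equal cycles, and within each cycle the tasks that are due run to completion in a fixed priority order. The Python sketch below simulates only the dispatch order. It is an assumption-laden illustration rather than a description of any particular DCS kernel, and the function name and the `(priority, period_in_cycles, name)` tuple layout are ours.

```python
def run_cycles(tasks, n_cycles):
    """Return, for each cycle, the names of the tasks dispatched in order.

    tasks: iterable of (priority, period_in_cycles, name) tuples, where a
    lower priority number means the task runs earlier within a cycle and
    period_in_cycles says how often (in whole cycles) the task is due.
    """
    ordered = sorted(tasks)  # priority order is fixed once, before run time
    trace = []
    for cycle in range(n_cycles):
        # A task is due in this cycle when the cycle index is a multiple
        # of its period; due tasks run in the precomputed priority order.
        trace.append([name for _, period, name in ordered if cycle % period == 0])
    return trace
```

For example, `run_cycles([(2, 2, "trend"), (1, 1, "loop")], 3)` yields `[["loop", "trend"], ["loop"], ["loop", "trend"]]`. Nothing in this policy checks that the tasks due in a cycle actually finish within it, which is the weakness the surrounding discussion turns on.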

The various loop control mechanisms, as expressed in sequences of compiled and linked
control block segments, are deposited by a systems engineering development tool, typically
hosted on a PC- or workstation-class machine, into the address spaces of L1 machines. The
loading policy is based on a priori knowledge of typical execution profiles of various control
blocks. The configuration tools estimate the load a given loop control policy will likely place
on the L1 machine, its periodic execution requirements, and its estimated duty cycle, or
completion time. On the basis of these factors (and a few heuristics-based magic
numbers thrown in for good measure) a given L1 machine is assigned its task set.
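
The kind of estimate such a configuration tool might make can be sketched in Python as follows. The sketch assumes that each loop is summarized by an estimated execution time and a period, that load is the sum of their ratios, and that the “magic number” is a utilization ceiling called `headroom`. None of these details come from an actual DCS configuration tool, and first-fit placement is simply the easiest policy to show.

```python
def estimated_utilization(loops):
    """Estimated processor load of one controller.

    loops: list of (exec_ms, period_ms) pairs; each loop contributes the
    fraction of every period that it is expected to spend executing.
    """
    return sum(exec_ms / period_ms for exec_ms, period_ms in loops)


def assign_first_fit(loops, n_controllers, headroom=0.5):
    """Place each loop on the first controller whose load stays within headroom."""
    controllers = [[] for _ in range(n_controllers)]
    for loop in loops:
        for task_set in controllers:
            if estimated_utilization(task_set + [loop]) <= headroom:
                task_set.append(loop)
                break
        else:
            # No controller could accept the loop without exceeding headroom.
            raise ValueError("no controller has capacity for loop %r" % (loop,))
    return controllers
```

Keeping `headroom` well below 1.0 mirrors the practice of deliberately leaving controllers lightly loaded when the timing behavior of the task set is not fully understood.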

Task  sets  and  their  interaction  profiles  are  rarely  understood  well  enough  at  design  time  to 
guarantee their correct temporal behavior. To compensate for this, L1 machines are
typically underutilized in terms of processor cycles. Furthermore, a great deal of
verification testing and system tuning is performed during the factory acceptance and on-site
system commissioning phases of a project. This multi-step configuration process is
required  for  establishing  confidence  that  the  system  logic  and  its  implementation  are,  in 
some  fashion,  correct.  This  process  is  not  only  time  consuming  and  cumbersome,  but 
represents  a  significant  cost,  both  before  commissioning  and  afterwards  during 
maintenance,  upgrades,  and  redesign  caused  by  process  changes  within  the  plant. 

This effort at scheduling L1 task sets is at the core of the DCS configuration problem. It does
not address the scheduling of L2 or L3 or L4 tasks, nor does it manage the interdependencies
among tasks at the various levels in the hierarchy. Therefore, it can be said that although
L0 and L1 are truly “hard” real-time, as defined by [Cheng88] and others, L2-L4 are only
“soft” real-time. Therefore, applications that are vertical in nature (i.e., engage the
resources of L0-L4 machines on behalf of some computation) are only soft real-time, at
best. We require the next generation of industrial control systems to be “vertically hard”
real-time, as opposed to today’s machines that are “horizontally hard,” and only so at the
lowest levels.

Because current DCS configurations are optimized for (and typically procured and applied
to) L1 and L2 applications, their resources are highly utilized at L0 and L1, but often
underutilized at L2. There are a number of reasons for this disparity. They are primarily
historical, but certainly the proprietary nature of contemporary DCS implementations
makes it difficult and expensive to realize integrated vertical solutions to advanced plant
control problems. This situation has led to the development of “middleware” companies
such as Oil Systems and Setpoint that have successfully developed and deployed a limited
number of control-oriented applications on standard general purpose computers that bridge
the gap between “business systems” at the top of the hierarchy and the “control systems”
(DCS’s and PLC’s) at the bottom. These middleware services tend to support market- and
process-specific  applications  requiring  real-time  archival  storage,  analysis,  and 
presentation  functions. 

The  historical  reasons  for  this  control  domain  isolation  and  under-utilization  are  rooted  in 
the  business  practices  of  both  vendors  and  end-users  of  regulatory  control  systems.  It  is 
difficult  for  vendors  to  think  outside  their  historical  context  and  base  market  applications. 
For  example,  Bailey  is  rooted  in  the  electric  utility  and  industrial  steam  markets, 
Honeywell  is  rooted  in  the  petroleum  refining  market,  and  Allen-Bradley  is  rooted  in  the 
electrical switch gear commodity market. Furthermore, the problems of control at L1 are
complex,  and  industry-specific  process  knowledge  is  an  asset  that  must  be  developed  and 
nurtured  over  time. 

On  the  end-user  side  of  the  equation,  there  are  a  number  of  impediments  to  developing 
vertically  integrated  control  policies  and  mechanisms.  First,  L3  and  L4  are  not  well 
understood as control domains. They are still today referred to in MIS, finance, or
manufacturing terms. Practitioners at these levels do not think in terms of real-time
control. Their semantic frames of reference are rooted in transaction processing, database
management, COBOL, MRP, and Lotus 1-2-3. To make matters worse, sponsors of L3 and L4
automation  initiatives  generally  do  not  define  their  problem  domain  in  terms  relevant  to 
process  plant  operations  personnel. 

The  semantic  gap  between  vendor  and  user,  and  among  user  communities,  will  likely  persist 
for  some  time.  It  is  a  condition  of  current  business  practices  and  tradition.  The  next 
generation  of  plant  control  systems  need  not  be  so  constrained.  There  are  cogent  reasons  to 
believe that providing control platforms and associated services designed to support the
construction  of  vertically  integrated  applications  will  be  a  primary  driver  for  resolving 
(dissolving)  this  disparity. 

2.1  Contemporary  Platforms 

Although  there  are  a  number  of  important  distinctions  among  the  competing  distributed 
control  systems  in  the  market  today,  they  are  all  much  more  similar  than  they  are  different. 
One  set  of  metrics  relates  to  the  size  and  complexity  of  the  control  tasks  the  systems  are 
commissioned to manage. Typical control system projects can be measured in terms of I/O
counts, control loops, and system database elements (aka “tags”).

The figure above provides a simplified model of a node in a contemporary process control
system. The graphic depicts a four loop controller and its associated local database. A
“control loop” is defined as an input/output pair wrapped around some control algorithm.
In many supervisory systems, the primary function is data acquisition (SCADA) for which
this I/O pair-per-loop rule is violated. Such systems may have 10-20 inputs for a single
output. Inputs comprise analog and digital measurements, while outputs are either analog or
digital control signals. During the course of computing (estimates of) process state changes
and requisite control policies, primary and secondary derived quantities and intermediate
calculations are made. All of these elements combine to define the contents of the “tag
database” associated with a given controller.

A large system configuration might, for example, contain a distributed database of 30,000
tags derived from 10,000 L0 I/O points, of which 1000 pairs are associated with 1000 L1
regulatory control loops. There may be another 7000 independent L0 process variable
measurements that provide input to 100 L2 supervisory controls responsible for producing
900 outputs.

From these points and L1 loops and L2 controls are derived the 12,000 primary process
meta-variables (e.g., averages, process state estimates, etc.) and another 8,000 secondary
metrics (e.g., trends, correlations, etc.). The 1000 L1 loops might be implemented in 50
controllers (hardened, single-board computers) hosting an average of 20 control loops
each, some of which are dual-redundant for safety reasons. The 100 L2 controls may be
implemented in 10 L2 controllers, some of which are redundant. These 60+ controllers
would be connected to their respective input-output channels through collocated I/O
subsystems. Each would maintain the real-time status of its tags. These tags would be
available  throughout  the  control  system  by  virtue  of  the  system’s  distributed  database  and 
communications  services. 
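
The sizing figures in this example can be cross-checked with a few lines of Python. The sketch only restates the arithmetic of the text; the function and key names are ours. The itemized I/O points come to 9,900 rather than 10,000, so the round figure in the text is presumably approximate.

```python
def size_example_system():
    """Recompute the totals quoted for the large example configuration."""
    l1_loops = 1000
    l1_io_points = 2 * l1_loops   # one input/output pair per L1 loop
    l2_inputs = 7000              # independent L0 measurements feeding L2
    l2_outputs = 900              # outputs produced by the L2 controls
    itemized_io = l1_io_points + l2_inputs + l2_outputs

    io_points = 10_000            # round figure quoted in the text
    primary = 12_000              # primary process meta-variables
    secondary = 8_000             # secondary metrics
    tags = io_points + primary + secondary

    l1_controllers = l1_loops // 20   # average of 20 loops per controller
    l2_controllers = 10
    return {
        "itemized_io": itemized_io,
        "tags": tags,
        "l1_controllers": l1_controllers,
        "controllers": l1_controllers + l2_controllers,
    }
```

The result reproduces the 30,000 tags, the 50 L1 controllers, and the base count of 60 controllers to which redundant units are added.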

Systems  of  this  size  are  rare  today  (<10%  of  the  total),  but  occur  frequently  enough  that 
system designs must take them into account. Many of today’s DCS vendors sell systems that
either cannot easily be scaled up to this size or, if designed for this size, cannot be scaled
down to small, economically viable systems. Scalability is a critical design requirement for
the  next  generation  of  commercial  control  systems. 

The  figure  below  depicts  a  platform  configuration  that  is  typical  of  today’s  plant  automation 
systems. L0 field devices are connected through some form of a field bus to L1 local or
remote I/O processors or integrated controllers. L1 devices are interconnected via control
network segments to L2 area controllers that are, in turn, connected through a backbone
network to general purpose L3 computing devices (e.g., via Ethernet-based LANs). These L3
devices are then used to interconnect to enterprise-wide business systems (e.g., through
IBM SNA-based networks).

This general configuration associates control domains with communications subnetworks,
since the physical geometry of the plant most often defines both the physical and logical
partitioning of the process control problem. In the next generation of control systems there
are clear cost and performance incentives to collapse the three physical networks (Levels 1-3
in the figure above) into a single high-performance structure. There will, however,
remain  practical  reasons  to  partition  the  control  problem  into  a  logical  hierarchy  that  is 
appropriately  mapped  onto  the  flattened  physical  structure. 

The  figure  shows  a  number  of  different  processing  nodes  representing  “control  domain 
hosts,"  or  servers.  In  this  simplified  example,  control  servers  (CS)  interact  with 
transducers  directly  interfaced  to  plant  processes.  Applications  servers  (AS)  interact  with 
the  control  services  to  produce  derived  process  state  information.  Display  servers  (DS) 
provide  the  domain  specific  operator  interfaces  which  can  be  shared  across  the  system  at 
end-user  devices  (in  the  sense  of  X-windows).  Bridge  servers  (BS)  isolate  control  domains 
and  route,  through  flow  control  protocols,  high  priority  safety-related  process 
synchronization  messages.  Gateway  servers  (GS)  provide  for  interconnecting  subsystems 
into  heterogeneous  internetworks. 

As  logical  as  this  picture  is,  contemporary  systems  have  not  solved  in  a  unified  manner  a 
number  of  key  problems,  including  issues  related  to  L1-L2-L3  interoperation,  CS-AS 
cooperation  at  a  given  level  or  across  levels,  GS  functionality  as  it  pertains  to  access  control 
and  accounting  in  an  internetwork,  or  the  style  and  function  of  shared  displays  (i.e.,  the 
“single window” operator console). These problems are difficult (if not intractable) given
the heterogeneous, loosely coupled, and proprietary nature of current control system
designs. The situation is exacerbated by the conservative policies governing control of
processes with safety-related side effects and the nature of capital project justification and
implementation cycles within end-user markets.

Even without a “unification principle,” the process industries have succeeded in building
systems in an ad hoc fashion out of multi-vendor components with reliance on in-house and
third-party systems integration services. The predominant trend is to utilize combinations
of PC-, workstation-, and VAX-class machines interconnected by LANs and LAN servers
for the L3 and L4 application platforms. These are in turn attached through non-standard
means to more homogeneous L1-L2 DCS or PLC environments. These configurations tend to
be relatively inexpensive from a hardware viewpoint. However, the less tangible costs
associated  with  integration  services,  licensed  software,  system  maintenance  and 
development,  documentation,  and  training  tend  to  be  very  high  when  compared  with  more 
homogeneous  solutions.  The  goal  of  next  generation  plant  control  systems  is  to  provide 
greater  benefits  and  lower  total  costs  through  rationalizing  these  ad  hoc  control  platforms. 

3.  Next  Generation  Industrial  Plant  Controls 

The  next  generation  of  industrial  plant  control  systems  will  be  architected  to  provide  new 
capabilities  while  at  the  same  time  addressing  the  deficiencies  of  the  current  generation 
DCS+LAN  products.  New  capabilities  will  likely  include  integrated  vertical  applications, 
wider  (aka,  plantwide)  spans  of  control,  greater  process  fidelity,  improved  availability  of 
the  whole  ensemble,  lower  total  costs  per  control  function,  and  backward  compatibility  to 
legacy systems.

The half-life of any new control system is estimated at 10 years, requiring its design basis
to support an evolutionary and adaptable implementation path. The design of a new control
machine is predicated on a number of base hardware technology enablers, certain system and
applications  software  paradigms,  and  competitive  and  technical  pressures.  For  the  purposes 
of  this  paper,  we  will  not  consider  market  or  financial  drivers,  although  they  are  in  many 
respects  more  important  than  technical  issues. 

3.1  Hardware  Technology  Drivers 

There  are  a  number  of  important  hardware  trends  that  must  be  considered.  Circuit  densities 
are increasing at about 25% per year, doubling every three years [Hennessy90]. Device
speeds  are  increasing  at  a  similar  rate.  This  is  equivalent  to  realizing  the  same  device 
functionality  in  half  the  space  at  twice  the  speed  every  three  years.  As  a  related 
development,  the  cost  per  processor  instruction  cycle  is  declining  at  25%  per  year.  This 
yields  100%  additional  processing  capacity  (operating  at  twice  the  speed  in  half  the  space) 
for  the  same  cost  every  three  years.  The  basis  today  is  25  MHz  machines.  By  the  mid-life 
of  a  new  system  we  will  be  able  to  use  200  MHz  processors  in  the  same  physical  space  and  at 
the  same  prices  as  today's  machines. 
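
As a rough check on these figures, the following sketch compounds a stated annual growth rate to find its doubling time, and projects clock speed from the 25 MHz basis. The function names and the nine-year horizon (three doublings) are illustrative assumptions, not part of the original analysis.

```python
import math


def doubling_time_years(annual_rate):
    """Years needed to double at a fixed annual growth rate (0.25 means 25%)."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


def project(base, annual_rate, years):
    """Value after compounding `base` at `annual_rate` for `years` years."""
    return base * (1.0 + annual_rate) ** years


def project_by_doubling(base, doubling_period_years, years):
    """Value when `base` doubles once every `doubling_period_years`."""
    return base * 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # 25% per year doubles in roughly three years.
    print(round(doubling_time_years(0.25), 2))
    # 25 MHz doubled three times (nine years) gives 200 MHz.
    print(project_by_doubling(25.0, 3.0, 9.0))
```

The same helpers can be applied to the memory, disk, and bandwidth rates quoted in this section to see what each stated rate implies over a chosen horizon.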

The cost of memory is declining at 15% per year, dropping by a factor of two every five
years. DRAM densities are increasing at about 60% per year, quadrupling every three
years. Therefore, in the span of just 10 years we should see twelve times the memory
density at one quarter the cost. At the same time, application address space is being
consumed at one additional address bit per year, on average, suggesting we need an additional
10 bits of address over the design half-life of a new machine. In today’s control systems we
use about 17 bits of address space per L0 device, 21 bits per L1 device, 23 bits per L2
device, and 24 bits per L3 device. By the year 2005 we estimate that L0 devices will
utilize 26 address bits, with 32 bits at L1, 34 bits at L2, and 36 bits at L3. Clearly, 64-bit
processors are required to implement the upper domains of the next generation of
machines.

Disk density is increasing about 25% per year, doubling in three years [Hennessy90]. This
keeps pace with the consumption of DRAM, and suggests that over the life of the system
secondary storage demands will increase for two principal reasons. First, backing storage
is required to contain (at least part of) the static images of the L1-L4 machinery. Second,
significant archival storage is required to log the operating history of the plant. For
example, a plant with 1,000 field measurements sampled at 1 Hz would produce a raw L0
data rate of 64 Kbps, assuming 64 bits per point (data, plus status, plus time stamp). That
represents a potential uncompressed L0 storage requirement of over 2 Terabits per year, or
250 Mbytes per point per year. Assuming an average compression factor of 0.6, we can
estimate an appetite of 150 Mbytes per point per year of required archival storage capacity.
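
The archival estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function names are hypothetical, a 365-day year is assumed, and the compression factor is read as the fraction of the raw volume that remains after compression, which is the reading consistent with the 250 to 150 Mbyte reduction in the text.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000; leap years ignored


def raw_rate_bps(points, sample_hz, bits_per_point):
    """Raw L0 data rate in bits per second for the whole plant."""
    return points * sample_hz * bits_per_point


def bits_per_year(rate_bps):
    """Uncompressed volume in bits accumulated over one year."""
    return rate_bps * SECONDS_PER_YEAR


def mbytes_per_point_per_year(sample_hz, bits_per_point, compression=1.0):
    """Archival volume for one point, in Mbytes (10**6 bytes) per year.

    `compression` is the retained fraction of the raw volume (assumption).
    """
    bits = sample_hz * bits_per_point * SECONDS_PER_YEAR * compression
    return bits / 8 / 1e6


if __name__ == "__main__":
    rate = raw_rate_bps(points=1000, sample_hz=1, bits_per_point=64)
    print(rate)                                      # bits per second
    print(bits_per_year(rate) / 1e12)                # Terabits per year
    print(mbytes_per_point_per_year(1, 64))          # uncompressed
    print(mbytes_per_point_per_year(1, 64, 0.6))     # with 0.6 retained
```

The computed values are roughly 64 Kbps, 2.02 Terabits per year, 252 Mbytes and 151 Mbytes per point per year, in line with the rounded figures quoted above.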

Available communications bandwidth is increasing by a factor of 10 every three years. Its
basis  today  is  10  Mbps,  yielding  100  Mbps  by  1995,  and  10  Gbps  by  2005.  This  bandwidth 
is  expected  to  be  absorbed  for  a  number  of  reasons,  primarily  at  automation  levels  L2  and 
L3,  including:  i)  the  routine  use  of  multimedia  man-machine  interfaces  that  support 
integrated  voice  and  full-frame  video  display  systems;  and  ii)  the  increasing  utilization  of 
optical  sensors.  These  sensors  have  application  in  many  control  domains,  but  when  used  for 
high  speed  flat  sheet  production  (such  as  steel,  film  and  paper  making)  can  produce 
enormous  volumes  of  data  in  very  short  periods. 

This  brief  summary  suggests  that  by  the  end  of  the  design  half-life  of  the  next  generation  of 
plant  control  systems  (circa  2005)  the  computational  nodes  of  the  system  will  routinely  be 
operating at 200 MHz, supporting an address space of 30-40 bits, intercommunicating at
10 Gigabits per second over an optical mesh, collectively tracking and controlling an
evolving  plant  state  comprising  over  10®  objects,  and  utilizing  Terabyte  backing  storage 
subsystems. This scenario points to the real design problem, software: its creation,
configuration,  deployment  and  maintenance. 

3.2  Software  Paradigm  Shifts 

Building  high  capacity  real-time  distributed  computing  systems  has  been  motivated  in  a 
number  of  applications  domains,  including  military  command  and  control,  industrial 
process  control,  public  transportation,  and  telecommunications  applications.  These  domains 
are  closely  related  in  terms  of  growth  in  demand  for  real-time  distributed  hardware  and 
software  based  functionality.  This  demand  in  the  industrial  automation  market  segment  is 
characterized  in  the  figure  below. 

In FY92 it is estimated that 45% of revenue derives from good old-fashioned L0-L1 regulatory
controls; 25% derives from more advanced L2 supervisory controls; 20% derives from
professional engineering services across the domains, but not including custom applications
software development and systems integration; and 10% derives from L3 plantwide control
functionality. Over the four year period ending in FY96 it is estimated that the process
automation industry will realize an 11% CAGR in its core Level 1 business, 40% CAGR from
Level 2 supervisory control, 75% CAGR from professional services, and 100% CAGR in the
Level 3 plantwide automation sector.
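
To see what these growth rates imply for the revenue mix, the sketch below compounds each FY92 share at its stated CAGR for four years and renormalizes. The segment labels and function name are illustrative, and applying the 11% rate to the whole L0-L1 regulatory share is an assumption made for this example.

```python
# FY92 revenue shares (percent) and four-year CAGRs as quoted in the text.
FY92_SHARE = {
    "L0-L1 regulatory": 45.0,
    "L2 supervisory": 25.0,
    "professional services": 20.0,
    "L3 plantwide": 10.0,
}
CAGR = {
    "L0-L1 regulatory": 0.11,
    "L2 supervisory": 0.40,
    "professional services": 0.75,
    "L3 plantwide": 1.00,
}


def project_mix(shares, cagr, years):
    """Compound each segment and return (new percentage mix, total index).

    The total index is relative to a base-year total of 100.
    """
    grown = {k: v * (1.0 + cagr[k]) ** years for k, v in shares.items()}
    total = sum(grown.values())
    mix = {k: 100.0 * v / total for k, v in grown.items()}
    return mix, total


if __name__ == "__main__":
    mix, total = project_mix(FY92_SHARE, CAGR, years=4)
    for segment, share in mix.items():
        print(f"{segment}: {share:.1f}%")
    print(f"market index vs. FY92 = 100: {total:.0f}")
```

Under these assumptions the regulatory segment falls from 45% to roughly 13% of the total, while services and plantwide control together exceed two thirds, which is the shift in emphasis the text describes.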


The  absolute  numbers  are  debatable,  but  the  trends  are  clear.  Over  the  next  five  to  ten 
years  the  automation  market  is  expected  to  grow  by  well  over  50%  CAGR  in  the  consumption 
of upper level and inter-domain applications of control. Within the L1 control domain it is
expected to grow less than 15%. Any strategic investment in technology must clearly
support the development of Level 2 and Level 3 control software, its vertical integration,
and its attendant professional services products.

The definition and implementation of software capable of correctly controlling a plant that is
hosted  on  a  distributed  computing  system  provides  a  significant  technical  and  marketing 
challenge.  Classical  programming  models,  methodologies,  and  tools  are  not  sufficiently  rich 
to  handle  either  the  complexity  or  volume  of  validated  code  that  can  be  hosted  on  distributed 
multi-processor  configurations  whose  nodes  routinely  support  40  bits  of  addressability.  A 
new  programming  model  is  required,  one  especially  focused  on  heterogeneous 
multiprocessing  of  mission  critical  applications. 

The work we and others have done supports adoption of a virtual machine model based on
objects, ports, and threads as demonstrated under various assumptions in the Mach
[Rashid86] and Alpha [Northcutt87] micro-kernel projects at CMU, Chorus from Chorus
Systemes, S.A. [Rosier92], OSF/1 from the Open Software Foundation [Loepere92]
[OSF92], and the Mach/RT project at the Center for High Performance Computing (CHPC),
Worcester Polytechnic Institute [Shipman92]. We will return to the implications and
application of this programming model in a later section, but will first consider the real-time
control problem domain from an applications level perspective.


3.2.1  Integrated  Vertical  Applications 

The  figure  below  defines  the  control  problem  space  for  the  next  generation  of  control 
systems in terms of object granularity (control span) and response time (related to state
variable persistence and control fidelity). Industrial automation systems must be capable of
hosting applications whose access rights and essential resources extend from L4 downward
through L0. Guaranteed response times must have lower bounds in the sub-millisecond
range.  Measurement  and  derived  process-state  must  be  stored  for  as  long  as  decades.  The 
business  opportunity  expressed  here  is  to  provide  platforms  and  related  applications  that 
follow  the  arrow  up  the  commercial  ‘food  chain’  (as  depicted  in  the  previous  graph).  As 
hardware  platforms  and  system  operating  software  are  continually  rationalized  (i.e., 
standardized  and  made  commodities),  the  real  value  added  to  the  marketplace  will  be  i)  the 
L0-L1 front ends, ii) the horizontal and vertical control applications, and iii) the attendant
professional services.

To realize systems within this space that adhere to “hard real-time” [Jensen92] operating
constraints we require a new approach to system design. The development of control policies
and mechanisms that engage the services of objects arrayed vertically from L0 to L4
requires a programming model that is fundamentally different from that employed in
contemporary systems. First, the model must provide semantics that are consistent across
the levels. Second, the underlying hardware environment must have known performance and
reliability  measures.  Third,  the  system  must  be  scalable,  since  plant  control  policies  and 
mechanisms,  and  end-user  requirements  will  evolve  over  time.  And  fourth,  the  next 
generation  of  automation  and  control  systems  will  have  to  connect  to,  and  interoperate  with, 
the  thousands  of  installed  legacy  systems  already  in  place  world-wide. 

“Vertical application” refers here to the capability to engage the services of objects within
the  sensor-actuator,  regulatory,  supervisory,  and  plantwide  domains  on  behalf  of  specific 
plant  control  policies.  This  requires  defining  mechanisms  in  a  semantically  consistent 
manner  across  the  domains.  Continuous  emissions  monitoring  is  a  good  example  of  a 
commercially  relevant  inter-domain  application.  Applications  of  this  type  require  the 
processing  of  a  variety  of  data  types  with  wide  variations  in  temporal  granularity,  the 
preservation  of  measurement  profiles  on  long-term  stable  storage,  often  complex  man- 
machine  graphic  presentations,  and  relatively  complex  numerical  methods. 

In  heterogeneous  systems,  vertical  applications  require  adherence  to  inter-level 
programming  interfaces  up  and  down,  and  across  the  hierarchy.  POSIX  [Zlotnick91]  and 
DCE  [OSF92]  are  examples  of  well-defined  interfaces  pertaining  to  the  boundary  layer 
between  applications  and  operating  system  software  and  among  applications.  Such  interfaces 
must  be  defined  for  the  application  domain  as  well  (i.e.,  distributed  control  system 
semantics).  The  interfaces  may  or  may  not  be  standardized  a  la  ISO,  but  for  the  application 
to  be  stable  over  the  life  of  its  implementation,  the  interfaces  must  at  least  be  stable.  In  a 
homogeneous  computing  environment  the  inter-  and  cross-level  interfaces  will  be  stable  by 
definition. 

Representative  applications  identified  for  deployment  within  the  intra-plant  control  domain 
vary  by  industry  and  business  practice.  The  figure  below  gives  some  flavor  for  their 
character. These “multi-vendor applications” are typically bolted together to satisfy the
needs  of  customer  sales-order  projects.  The  process  is  classical  systems  integration  work, 
generally  at  or  above  L2,  resulting  in  ad  hoc  solutions  with  limited  reuse  potential. 
Furthermore,  the  hardware  and  software  elements  above  L2  are  typically  outside  purchased 
equipment  and  under  the  design  control  of  third-parties.  Therefore,  the  total  operating  cost 
of  the  resultant  system  can  be  rather  high  given  the  maintenance  and  complexity  of 
upgrading  such  systems. 


[Figure: representative applications by control level. The figure text is only partially
legible in the source; entries that can be read with reasonable confidence are listed.]

Level 3, Plantwide: total quality control, inventory management, purchasing, order
processing, production scheduling, laboratory data, office document management, technical
document management, materials requirements planning, capacity planning, health and
safety systems, L2 synchronization, communications.

Level 2, Supervisory: process computations, area energy management, laboratory sampling,
process modeling, control room management, area scheduling, process optimization, area
alarming, emergency shutdown, batch recipe management, area safety systems,
L1 synchronization, communications.

Level 1, Regulatory: sequencing, loop safety, control redundancy, scan and data collection,
loop adaptation, regulator tuning, process alarming, process modeling, actuator calibration,
adaptive control, L0 synchronization, communications.

Level 0, Transducer: sensors, actuators, conditioning, smart actuators, drives, self
calibration, diagnostics, sensor alarming, actuator alarming, signal filtering, signal
enhancement, synchronization, communications.

NATO Advanced Study Institute on Real-Time Control, October 5, 1992


An important goal of the next generation of control systems is to “objectify” these
applications  through  the  definition  and  use  of  standard  interfaces  at  the  application  (e.g., 
DCE)  and  operating  system  (e.g.,  Mach)  levels.  This  should  result  in  moving  the  problem  of 
constructing  integrated  control  applications  from  a  programming  problem  to  a  software 
configuration  problem,  with  a  concomitant  reduction  in  total-installed  and  total-operating 
costs,  an  increase  in  reusability,  and  an  improvement  in  total  product  quality  and  customer 
satisfaction. 

3.2.2  Wider  Spans  of  Control 

Applications with wide control spans (e.g., those at L3) reach across many lower level, yet
autonomous,  control  domains.  These  applications  invoke  the  services  of  objects  at  levels 
above  and  below  the  end-user  domain,  but  the  execution  threads  spend  most  of  their  time 
wandering  the  object  domains  at  lower  levels.  The  semantics  are  a  mixture  of  client-server 
and  peer-to-peer,  neither  strictly  dominant.  Energy  management  within  a  multi-area 
plant  (e.g.,  a  pulp  and  paper  mill)  is  a  good  example  of  a  wide-span  control  problem  with 
cooperating  process  semantics,  but  also  containing  global  optimization,  or  client-server 
relations. Energy management is considered an L3 function (by the nature of its end-user’s
management position) that requires services from many essentially independent L2 and L1
control  objects. 

3.2.3  Greater  Control  Fidelity 

Higher  performance  systems  are  possible  by  the  nature  of  the  technology-cost  curves  we 
are  riding.  Higher  performance  begets  greater  speed,  higher  precision,  and  increased 
functionality  leading  to  greater  control  fidelity.  Increased  fidelity  implies  higher  loop- 
level  bandwidth,  increased  control  accuracy,  and  greater  persistence  of  abstract  data  types 
utilized within the system. Larger, more reliable storage allows operating history to be
brought to bear on real-time control strategies, facilitating the construction of expert
systems and learning automata. These opportunities are clearly control level-dependent.

At L0, control fidelity refers to the precision, in both time and value, of the input and output
transducers. At L1, fidelity concerns the accuracy of the process models that govern
regulatory control policies. L2 fidelity concerns the correct synchronization of regulatory
policies on behalf of supervisory control functions. And L3 fidelity, albeit more coarsely
grained, provides precision to computations governing supervisory domain scheduling and
associated plant coordination and control functions. Fidelity in these various control
domains is also application specific. The next generation of control systems should therefore
provide for the specification and implementation of policies and mechanisms governing
domain-specific fidelity.

3.2.4 100% Availability

Availability  is  at  the  center  of  what  is  meant  by  mission-critical  control.  The  next 
generation  of  control  systems  must  be  inherently  survivable  under  a  broad  range  of  system 
failure  modes  and  plant  operating  conditions.  At  the  heart  of  the  survivability  issue  are 
requirements  for  fault-isolation,  graceful  degradation,  dynamic  reallocation  of  resources, 
and  task  migration  semantics.  Alpha  micro-kernel  semantics  [Clark92]  are,  for  example, 
particularly  well  suited  to  provide  the  underlying  distributed  operating  system  mechanisms 
for  implementing  high  availability. 

Availability  has  its  roots  in  hardware  design.  The  platform  must  be  constructed  to  support 
redundancy  of  processing  nodes,  communications  paths,  and  operator  interfaces.  The 
critical factor is, once again, robust system and applications software.

There are three levels of software that are of concern. At any given node in the control
network there will exist components of a distributed, application-domain-independent
real-time operating system (D/OS). The D/OS provides a reliable and consistent distributed
system  programming  model,  or  virtual  machine,  to  the  next  layer  of  control  node  software. 

This  next  layer,  or  real-time  middleware,  comprises  control-domain  specific  services  that 
create  on  top  of  the  general  D/OS  virtual  machine  services  the  personality  of  a  homogeneous, 
fault-tolerant,  high-availability  plant  control  system.  This  personality  defines  a  consistent 
model  of  a  plant  control  system  (PCS)  that  is  supported  by  specific  hardware  features  which 
are  required,  but  not  necessarily  seen,  by  the  control  applications  that  are  at  the  next 
abstract  layer. 

The  top  layer  of  software  is  seen  by  the  PCS  programmers  who  are  tasked  with  creating 
solutions  to  customer  control  problems.  This  layer  of  programming  abstraction  is 
optimized  for  the  problem-domain,  and  presents  an  environment  where  availability  can  be 
assumed  (i.e.,  provided  by  the  underlying  layers)  and  abstracted  out  of  the  L0-L4  domain. 
Thus  the  D/OS  and  the  PCS  layers  are  responsible  for  providing  services  that  guarantee  the 
reliable  operation  of  the  control  platform,  and  the  application  programming  layer  is 
responsible  for  providing  services  that  control  the  plant. 
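
The division of responsibility among the three layers can be illustrated with a small sketch. Everything here is hypothetical: the class names `DOS` and `PCS`, the `read` interface, and the tag and node names are invented for illustration and do not describe the actual Elsag Bailey design. The point is only that the application function contains control logic and no availability handling, because failover among redundant nodes happens in the layer beneath it.

```python
class NodeFailure(Exception):
    """Raised by the hypothetical D/OS layer when a node cannot answer."""


class DOS:
    """Stand-in for the D/OS virtual machine: addresses nodes by name."""

    def __init__(self, nodes):
        # node name -> callable(tag) returning a measurement, or None if down
        self._nodes = nodes

    def read(self, node, tag):
        source = self._nodes.get(node)
        if source is None:
            raise NodeFailure(node)
        return source(tag)


class PCS:
    """Stand-in for the real-time middleware: hides redundancy."""

    def __init__(self, dos, replicas):
        self._dos = dos
        self._replicas = replicas  # tag -> ordered list of nodes serving it

    def read(self, tag):
        for node in self._replicas[tag]:
            try:
                return self._dos.read(node, tag)
            except NodeFailure:
                continue  # fall through to the next redundant node
        raise NodeFailure(f"no replica available for {tag}")


def high_temperature_alarm(pcs, tag, limit):
    """Application layer: control logic only, availability is assumed."""
    return pcs.read(tag) > limit


if __name__ == "__main__":
    dos = DOS({"node_a": None, "node_b": lambda tag: 412.0})
    pcs = PCS(dos, {"TI-101": ["node_a", "node_b"]})
    # node_a is down; the middleware silently uses node_b.
    print(high_temperature_alarm(pcs, "TI-101", 400.0))
```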

3.5 Lower Per-Function Cost

The cost structure of contemporary control systems, as perceived by the end-user, is a
critical design factor that today is centered on the idea of “cost-per-point.” This bias has
its origins in the low-level sensor-actuator-regulator manufacturing roots of DCS and PLC
vendors, and the resulting conditioning of the market. For example, analog input or output
costs about $50/point today, or $800 for a 16-point multiplexed A-to-D card. On the
basis of the trends discussed in Section 2.1, a 30,000 tag system containing 10,000 I/O
points would cost $0.5M for the L1 front-ends, not including the L0 devices. The next
generation  must  treat  this  costing  as  an  initial  ceiling,  and  follow  the  relevant  cost- 
performance-volume  curves. 
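
The cost-per-point arithmetic quoted above is simple enough to state as code. This is a minimal sketch with hypothetical function names; the $50 per point and 16 points per card come from the text, and the assumption that every I/O point is priced identically is a simplification.

```python
import math


def front_end_cost(io_points, cost_per_point=50.0):
    """L1 front-end cost in dollars at a flat cost per I/O point."""
    return io_points * cost_per_point


def cards_needed(io_points, points_per_card=16):
    """Number of multiplexed I/O cards needed to terminate the points."""
    return math.ceil(io_points / points_per_card)


if __name__ == "__main__":
    print(front_end_cost(16))        # one 16-point card
    print(front_end_cost(10_000))    # the 10,000 I/O point example
    print(cards_needed(10_000))      # cards required for that system
```

For the 10,000-point example this gives $500,000 spread over 625 cards, matching the $0.5M ceiling stated in the text.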

In  the  next  generation  of  control  systems  software  will  be  the  single  largest  cost  to  produce 
and  maintain,  so  packaging  and  pricing  strategies  will  have  to  be  significantly  altered  from 
those used today. In addition, since the functionality offered at L1-L3 represents layered
software services (e.g., D/OS -> PCS -> specific customer control applications) there are
opportunities for “packaging control” in new and creative ways, perhaps bundling it with
its  requisite  I/O,  and  offering  the  ensemble  as  a  formalized  subsystem.  These  issues  are 
open,  yet  relevant  to  the  development  of  the  next  generation  of  control  systems,  for  they 
establish  the  cost-per-function  profiles  which  govern  the  design  and  implementation 
decisions. 

3.6  Backward  Compatibility 

Backward  compatibility  is  an  absolute  requirement.  The  continuous  process  industries 
build  and  operate  plants  that  exhibit  10-15  year  half-lives.  The  automation  infrastructure 
of  those  plants  must  exhibit  the  same  installation  life-times.  Therefore,  the  next  generation 
of  control  systems  must  connect  to  and  interwork  with  these  legacy  systems. 

There  are  essentially  three  means  to  accomplish  this  end.  The  first  is  to  faithfully  emulate 
the  legacy  systems  (i.e.,  their  applications  and  hardware  characteristics)  within  the  new 
environment,  and  to  connect  to  the  older  low  level  I/O  systems.  The  second  is  to  host  only 
legacy applications, with certain restrictions and caveats, and ignore the faithful emulation
of legacy hardware. The third is to develop a formal gateway through which the next
generation  system  views  the  legacy  machinery  as  a  server.  A  formal  legacy  system  service 
must  be  provided  through  one  or  more  of  these  mechanisms,  most  likely  a  limited  version  of 
the  second,  and  definitely  the  third. 
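
The third approach, a gateway that presents legacy machinery as a server, resembles the familiar adapter pattern. The sketch below is purely illustrative: `LegacyDCS`, its register-style `peek`/`poke` interface, the register number, and the tag name are all invented for the example and do not describe any real product.

```python
class LegacyDCS:
    """Hypothetical legacy system exposing raw integer registers."""

    def __init__(self):
        self._registers = {40001: 731}

    def peek(self, register):
        return self._registers[register]

    def poke(self, register, raw):
        self._registers[register] = raw


class LegacyGateway:
    """Presents the legacy machinery as a server of named, scaled points."""

    def __init__(self, legacy, point_map):
        self._legacy = legacy
        self._map = point_map  # tag -> (register, engineering-unit scale)

    def read(self, tag):
        register, scale = self._map[tag]
        return self._legacy.peek(register) * scale

    def write(self, tag, value):
        register, scale = self._map[tag]
        self._legacy.poke(register, round(value / scale))


if __name__ == "__main__":
    gateway = LegacyGateway(LegacyDCS(), {"FT-7": (40001, 0.1)})
    print(gateway.read("FT-7"))   # raw 731 scaled to engineering units
    gateway.write("FT-7", 50.0)
    print(gateway.read("FT-7"))
```

New-generation applications would then address legacy points by tag through the same client-server interface they use elsewhere, leaving the register mapping confined to the gateway.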

4.  Programming  Semantics  for  Integrated  Controls 

The  semantics  we  propose  here  are  independent  of  hardware  platform.  The  hardware  is 
abstracted  out  of  the  programming  domain  by  the  underlying  distributed  kernel  and 
operating system services (D/OS). The hardware environment for the PCS can be viewed as
a  multicomputer  system  comprising  one  or  more  multiprocessor  nodes  interconnected  by  a 
communications  network  that  makes  the  multicomputer  transparent  to  the  application.  The 
underlying  D/OS  provides  services  that  expose  the  distributed  nature  of  the  hardware  should 
the  application  require  it,  but  a  basic  principle  of  the  PCS  is  to  mask  from  the  applications 
any  access  to  the  real  hardware  of  the  machine.  The  stylized  run-time  environment  of  the 
Elsag  Bailey  machine  node  is  depicted  below. 


[Figure: processing node configuration]


The  real-time  application  domain  of  the  PCS  is  graphically  depicted  below.  Market-specific 
applications,  such  as  gas  turbine  controls,  are  defined  in  terms  of  interactions  among 
specific  control  domain  objects.  These  objects  define  the  abstract  data  types  that  specify 
behaviors  of  control  elements,  according  to  some  application-specific  control  policies.  The 
various  industry-specific  object  libraries  will  share  many  common  elements,  from 
physical  plant  elements  (e.g.,  valves  and  motor  controls)  to  control  strategies  and 
mechanisms.  Furthermore,  L|  control  elements  may  inherit  object  properties,  defining 
higher  level  meta-objects.  This  expanding  scope  in  the  control  hierarchy  is  depicted  in  the 
figure  horizontally,  right  to  left.  The  figure  indicates  that  the  application  has  (potential 
direct)  access  to  objects  residing  at  each  control  level  (i.e.,  to  objects  with  varying  spans  of 
control).  These  capabilities  are  restricted  by  the  definition  of  the  application  and  its 
activation.

4.1 Micro-kernel Level Semantics

The  underlying  kernel  semantics  are  based  on  a  fusion  of  the  Mach  3.0  micro-kernel 
[Loepere92]  and  those  features  of  the  Alpha  OS  [Clark92]  responsible  for  real-time 
resource  management,  as  being  implemented  in  Mach/RT  [Shipman92].  The  fundamental 
abstractions  supported  at  this  level  are 

•  task: the unit of resource allocation, a container to hold references to resources
   (handles) in the form of a virtual address space, a port name space (set of port
   rights), and set of threads

•  thread: the unit of processor utilization, an execution point of control within a task
   defined by a program counter, register set and stack

•  port: a unidirectional communications channel between tasks, accessible only via
   send/receive capabilities

•  port set: a set of ports that can be treated as a single unit for the purposes of
   receiving a message

•  port right: a capability allowing a task to exercise certain access rights to a port

•  port name space: an indexed collection of port names, each of which names a specific
   port right

•  message: a typed collection of data objects passed between two tasks

•  message queue: a queue of messages associated with a given port

•  virtual address space: a sparsely populated index of memory pages that may be
   referenced by the threads within a task

•  memory object: an internal unit of memory allocation that represents the non-resident
   state of the memory pages backed by this object

•  memory cache object: a kernel object that contains the resident state of the memory
   objects

•  processor: a physical device (cpu) capable of executing the threads of a task

•  processor set: a set of processors, each of which can be used to execute threads
   assigned to the set

•  node: an individual multiprocessor within the PCS multicomputer environment

•  host: the distributed PCS platform hardware taken as a whole

•  device: a physical device (resource) available to a user-mode task, such as an L0
   transducer available to an L1 task

•  event: an (asynchronous) signal dispatched by the kernel to zero or more tasks
   executing on the multicomputer
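
A few of the abstractions listed above (task, port, port right, port name space, message queue) can be sketched as plain data structures to show how they relate. This is a toy model under simplifying assumptions, not the Mach or Mach/RT interface: all class and method names are invented, each local port name carries exactly one right, and there is no blocking, typing of messages, or kernel mediation.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Right(Enum):
    SEND = "send"
    RECEIVE = "receive"


@dataclass
class Port:
    """Unidirectional channel; owns the message queue associated with it."""
    queue: deque = field(default_factory=deque)


@dataclass
class Task:
    """Unit of resource allocation: a port name space plus threads."""
    name: str
    port_names: dict = field(default_factory=dict)  # local name -> (Port, Right)
    threads: list = field(default_factory=list)

    def grant(self, local_name, port, right):
        """Enter a port right into this task's port name space."""
        self.port_names[local_name] = (port, right)

    def send(self, local_name, message):
        port, right = self.port_names[local_name]
        if right is not Right.SEND:
            raise PermissionError("no send right for this port name")
        port.queue.append(message)

    def receive(self, local_name):
        """Return the oldest queued message, or None if the queue is empty."""
        port, right = self.port_names[local_name]
        if right is not Right.RECEIVE:
            raise PermissionError("no receive right for this port name")
        return port.queue.popleft() if port.queue else None


if __name__ == "__main__":
    port = Port()
    scanner = Task("L1 scan task")
    supervisor = Task("L2 supervisor")
    scanner.grant("out", port, Right.SEND)
    supervisor.grant("in", port, Right.RECEIVE)
    scanner.send("out", {"tag": "PT-3", "value": 2.5})
    print(supervisor.receive("in"))
```

The key property the sketch preserves is that a task can reach a port only through a right held in its own name space, so access control is a matter of which rights have been granted rather than of global names.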


4.2  D/OS  Level  Semantics 

The semantics of a distributed real-time operating system (D/OS) suitable for the next
generation of plant control systems (PCS) are subject to debate, and will likely be of a
proprietary nature. The control environment and its requirements for operating system
services remain largely at the discretion of the implementer, since the general
requirements of “open systems” do not directly apply to the domain of real-time control.
This is especially true within an industrial control setting where the manufacturing
processes (e.g., recipes) used to create products are themselves of considerable value. With
the ability to mount guest operating systems on selected nodes of the PCS, the
requirements for openness and inter-operability are easily achieved.

Bailey  Controls,  in  conjunction  with  the  Elsag  Bailey  Process  Automation  Group  of 
companies,  is  currently  defining  D/OS  services  specifically  tuned  for  the  requirements  of 
industrial  continuous  and  batch  process  automation.  The  specific  details  of  this  layer  will  be 
defined  in  subsequent  papers,  but  several  of  the  salient  features  can  be  outlined  as  follows. 

Under specific configuration rules, the kernel-D/OS environment will form the basis of a
class of virtual machines that can be scaled onto physical machines ranging from L0 devices
(e.g., pressure transmitters) through L4 devices (e.g., energy management controllers).
These control subsystems may then be (incrementally) blended into a larger control system
(“mesh”) with minimal risk. The requirement for overall system scalability demands that
D/OS services be themselves partitioned in such a way that control nodes can be added to and
deleted from the mesh dynamically, with minimum upset to the operating plant.

The  physical  configuration  of  the  PCS  is  determined  by  a  control  problem-domain  rule  base 
that  takes  into  account  domain  complexity  (control  policy  semantics),  redundancy 
requirements  (availability),  inter-node  distances  (plant  topology),  and  speed.  These 
factors  contribute  to  the  specification  of  the  number  of  nodes,  their  respective  compute  and 
storage  capacities,  the  number  and  speed  of  interconnecting  links,  the  size  of  the  name- 
space  within  and  among  node  multiprocessors,  and  the  configuration  of  D/OS  services.  The 
aggregation, at one end of the configuration spectrum, is a uniprocessor with dedicated I/O
(a small “entry level” system). At the other end, it is an n-cube mesh of multiprocessors.
We believe that the practical range of PCS configurations will lie in the 1- and 2-D mesh
structures that support n:1 and 1:1 redundancy among processor sets that define (at least)
logical nodes dedicated to specific plant control tasks.

The mesh concept views nodes (one or more processor sets) as execution sites for control
policies  that  are  relevant  to  specific  control  domains.  Nodes  can  be  associated  into  node_sets 
that  host  the  regulatory  and  supervisory  applications.  Nodes  and  node_sets  generally  show 
affinity  by  the  nature  of  the  control  policies  that  execute  on  them,  and  by  the  resource 
scheduling  policies  required  for  the  timely  execution  of  those  policies. 
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A  control  domain  generally  has  both  physical  and  logical  connotations,  but  is  application 
specific.  The  D/OS  provides  middleware  services  relevant  to  developing,  configuring  and 
managing  the  suite  of  applications  required  to  manage  the  customer’s  plant.  These  services 
include: 


•  configuration services .... PCS configuration specification and management services

•  domain databases .... PCS problem domain configuration services

•  operator interfaces .... PCS man-machine interface environment and tools

•  event management .... PCS system-wide event (alarm) management services

•  directory services .... PCS name services

•  security services .... PCS security services

•  filing services .... PCS file system services

•  internetworking .... PCS network interface services

•  archiving services .... PCS activity and data logging services

•  reporting services .... PCS trending and reporting services

•  control policies .... PCS control policy specification services

•  transnode scheduling .... PCS resource scheduling services

•  fault recovery .... PCS fault isolation and recovery services

•  control events .... PCS event response policies and mechanisms


Many of these services will be applications that reside on the “guest nodes” of the PCS,
since they are services that are either required during the development and commissioning
of the customer’s system, or are not engaged in the direct control of the plant and can run
with relaxed availability requirements. Hence, these D/OS services are resident on the PCS
host multicomputer, but occupy resources on only a subset of the nodes of the system. The
particular nodes engaged in supporting these services, versus the node_set responsible for
hosting the actual plant control policies, are determined at configuration time and are
dynamically modified under faults, forced reconfigurations, and dynamic “what if”
conditions.

4.3  Control  Application  Level  Semantics 

The semantics of the application layer of the PCS are based on i) classical linear, sequential,
sampled-data control [Houpis92]; ii) advanced non-linear, stochastic, optimal estimation,
filtering, and control [Isermann91]; and iii) controls based on heuristics [Slagle71] and
such non-traditional mechanisms as neural and fuzzy logic controllers [Kosko92]. These
semantics are quite rich in both traditional usage and theory, and this paper is not the forum
for a tutorial review.

In  the  situation  where  control  policies  require  the  services  (resources)  of  applications 
running  outside  of  the  execution  domain  of  the  distributed  operating  system,  those  services 
can be mounted on a guest operating system (e.g., NT, VMS or OSF/1) and run as a server on
one  of  the  nodes,  or  one  of  the  processor_sets,  of  the  host. 

As an example, a continuous emissions monitoring (CEM) application might engage the
services of plantwide, supervisory, regulatory, and transducer objects of the PCS. These
objects contain (possibly multi-threaded) tasks that carry out work on behalf of a number of
concurrently executing applications. The CEM application might begin its life by invoking
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activity  in  a  L3  object  (i.e.,  the  application-level  thread)  which  manifests  itself  as  possibly 
many  parallel  Mach  execution  threads.  The  CEM  application  progresses  along  this  two 
dimensional  trajectory  in  time,  sequentially  through  the  CEM  application  (a  la  the  client- 
server  paradigm),  and  concurrently  along  the  various  control  policies  as  instantiated  in  the 
subservient  control  objects  (a  la  peer-to-peer). 

The figure below is a simplistic depiction of the “application thread” meandering through
the  distributed  objects  that  govern  PCS  control  policies. 


[Figure: L3 object invocation (e.g., user request) traversing the L3 object, L2 object, L1 object, and L0 object; layers labeled virtual machine and real machine.]

This  figure  conveys  the  peer-level  semantics  of  control  policy  objects  whose  (ideally, 
provably  correct)  operations  are  restricted  to  a  given  problem  domain.  This  model  is  being 
used  to  test  the  concepts  of  dynamic  replacement  of  running  control  policies  by  substituting, 
in  real-time,  the  control  policies  governing  various  processes.  This  dynamic  control  policy 
feature is critical to fault isolation, load balancing under upset conditions, policy “what ifs”
based  on  high  fidelity  simulations  against  the  running  plant,  and  other  related  applications 
requirements. 

5.  Conclusions 

Within  the  framework  presented  here,  there  are  a  number  of  open  conceptual, 
implementation,  and  technology-related  issues.  They  represent  ongoing  efforts  in  defining 
application  domain  and  platform  semantics  that  are  required  to  realize  the  next  generation  of 
plant  control  systems.  The  business  drivers  for  integrated  vertical  applications  in  an  open 
real-time  computing  environment  clearly  dictate  capabilities  not  found  in  current 
distributed  system  platforms,  but  which  we  believe  are  enabled  by  the  computing 
environments  defined  in  the  cited  work.  Key  to  these  new  computing  technologies  is  the  body 
of work related to the real-time distributed resource management architecture as defined in
the Mach [Rashid86], Archons [Northcutt87], and Clouds [Dasgupta88] experience of the
late 1980s. This work has led to the merging of capabilities in the standards-based OSF
environment,  especially  the  OSF  kernel  work  at  CHPC,  OSF  Advanced  Development,  and 
elsewhere. 

We believe the experiences gained to date justify the development of distributed real-time
platforms based on non-traditional scheduling policies. The key feature of a new class of
policies is their basis in the distributed threads model, and the ability to associate policy
with all the resources of a computation, even if such policies exist across processors in a
processor_set. This trans_node scheduling capability is critical to the implementation of
adaptable configurations that can be restructured under the stochastic nature of real-time
mission critical computations.
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KEY  ISSUE 




Computational  Model 

Should  define  functionality,  concurrency,  termination, 
communication,  (refinement). 

Should  contain  attributes  to  represent  requirements 
such  as  release  time,  period,  deadlines  on  computation 
and  communication,  utility/benefit  etc. 

Should  contain  attributes  to  represent  derived 
properties  such  as  worst  case  resource  usage, 
completion  times  (worst  case  or  probabilistic), 
allocations,  priorities  etc. 

Should  allow  decomposition,  with  the  attributes  defined 
at  appropriate  levels. 

Should recognise that all computations/communications take time.

Should  facilitate  analysis  of  final  (and  incomplete) 
designs. 
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•  Should  allow  the  non-determinacy  in  the  environment 
to  influence  the  dynamic  behaviour  of  the  system. 


•  Should  not  force  premature  commitment  to 
configuration  issues. 


Above  requirements  lead  to  an  asynchronous  model  of 
interaction. 


Two  Design  Methods  have  been  built  upon  this 
computational  model: 

•  HRT-HOOD  —  a  structured  method

•  TAM  —  a  formal  method 
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HRT-HOOD 


Gives  explicit  support  to  common  object  classes: 

Active 

Periodic 

Sporadic 

Passive 

Protected 

Periodic  and  sporadic  objects  contain  a  single  thread  and 
communicate  via  passive  and  protected  objects. 

Attributes  define  deadlines,  offsets,  worst  case  response 
times  etc. 

Rules of decomposition force the crucial objects in a
system to be predictable in their worst case behaviour.

Transformations are available from HRT-HOOD designs to
Ada 9X code analysed (off-line) using static allocation (of
objects to processors), and static priorities with immediate
ceiling priority inheritance.
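
As an illustration of the kind of off-line analysis that static allocation and static priorities make possible, the sketch below computes worst case response times with the standard fixed-priority recurrence R = C + B + Σ ⌈R/Tj⌉·Cj, where B is the single blocking term that a ceiling priority protocol allows. The slides do not say which analysis is actually applied to HRT-HOOD designs, so the recurrence is an assumption here; the code is Python rather than Ada 9X, and the task set, the `Task` type, and `response_time` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Task:
    name: str
    period: int        # minimum inter-release time, in ms (hypothetical values)
    wcet: int          # worst case execution time, in ms
    deadline: int      # relative deadline, in ms
    blocking: int = 0  # longest lower-priority critical section that can block this task


def response_time(task: Task, higher: List[Task]) -> Optional[int]:
    """Worst case response time in ms, or None if it would exceed the deadline."""
    r = task.wcet + task.blocking
    while True:
        # Each higher-priority task can preempt once per release inside the window r.
        interference = sum(-(-r // hp.period) * hp.wcet for hp in higher)
        nxt = task.wcet + task.blocking + interference
        if nxt > task.deadline:
            return None
        if nxt == r:
            return r  # fixed point reached
        r = nxt


# Hypothetical task set on one processor, listed from highest to lowest priority.
tasks = [
    Task("T1", period=7, wcet=3, deadline=7, blocking=1),
    Task("T2", period=12, wcet=3, deadline=12, blocking=1),
    Task("T3", period=20, wcet=5, deadline=20),
]

results = {t.name: response_time(t, tasks[:i]) for i, t in enumerate(tasks)}
print(results)
```

A `None` entry would flag an object whose worst case behaviour cannot be guaranteed under the assumed priorities and allocation.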


CONTROL  SOFTWARE 


Figure  3:  The  Control  Software 
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COf^TROLLER 


Further  decomposition  of  the  SENSOR  object  is  shown  in  Figures  5  to  9. 
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Further  decomposition  of  the  SENSOR  object  is  shown  in  Figures  5  to  9. 
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Figure  6:  The  IRES  Object 


The IRES sensor controller consists of two precedence-constrained cyclic objects, with a time offset
between the two implementing the required synchronisation. The first object, REQUEST IRES DATA,
sends a request to the IRES (via the serial bus). The second object receives and interprets the sensor
values. The relative time offset between the task releases, together with the deadline of the first object,
ensures that the sensor device has a chance to respond (at least 30 ms).
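
The offset relationship described above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not HRT-HOOD or Ada 9X code. Only the 30 ms minimum response time comes from the text; the 200 ms period, the 20 ms request deadline, the 50 ms offset, and the name RECEIVE_IRES_DATA are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CyclicObject:
    name: str
    period_ms: int    # both objects share one period (assumed value)
    offset_ms: int    # release offset within each period
    deadline_ms: int  # deadline relative to release

    def release_times(self, horizon_ms: int) -> List[int]:
        """Absolute release times in [0, horizon_ms)."""
        return list(range(self.offset_ms, horizon_ms, self.period_ms))


def min_device_window_ms(request: CyclicObject, receive: CyclicObject) -> int:
    """Smallest gap between the latest completion of the request
    (its release plus its deadline) and the release of the receiver."""
    return receive.offset_ms - (request.offset_ms + request.deadline_ms)


MIN_RESPONSE_MS = 30  # from the text: the device needs at least 30 ms

request = CyclicObject("REQUEST_IRES_DATA", period_ms=200, offset_ms=0, deadline_ms=20)
receive = CyclicObject("RECEIVE_IRES_DATA", period_ms=200, offset_ms=50, deadline_ms=100)

window = min_device_window_ms(request, receive)
offset_is_safe = window >= MIN_RESPONSE_MS
print(window, offset_is_safe)
```

Because the check uses the request deadline rather than its typical completion time, it holds even when the request finishes as late as it is allowed to.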
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Current  Status 

First  version  of  HRT-HOOD  defined. 

Based  on  HOOD  V3 

Mappings  defined  for  Ada  9X 
(December  1991) 

Mappings  defined  for  Ada  83  plus 
ARTEWG  CIFO-like  entries  for 
added  functionality 

UK  DRA  are  supporting  a  project 
which  will  generate  CASE  tool- 
support  (prototype  tools  only) 

New version of HRT-HOOD this year
to  address  HOOD  3.1  and  final 
version  of  Ada  9X. 
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Future  NASA  Projects 


Overview  of  Projects  and  Concepts  Under 
Consideration  for  Future  NASA  Projects 


PRESENTATION  TO 

Real  Time  Workshop 
Institute  for  Defense  Analyses 
Alexandria,  Va. 

MARCH 15, 1993


Ed  Chavars 

Deputy Chief/Information Sciences Division
NASA Ames Research Center


NASA

Ames Research Center


NASA  Strategic  Intent 


•  Explore Space


•  Support  improvement  in  Competitiveness 


•  Reduce  Operational  Costs 


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Space  Challanges 


Develop  and  deploy  advanced  methods  of  systems 
engineering 

-  Span across multiple programs

-  Lead  to  smarter,  faster,  and  cheaper  systems 

Compete  effectively  in  an  expanding  international  global 
marketplace 


Effective  utilization  of  cutting  edge  technology  along  with 
proven  technology  to  support  cutting  edge 
accomplishments 




Space  Trends 


Increased  emphasis  on  cost-driven  (as  opposed  to 
performance  driven)  projects 


Flight  and  ground  infrastructure  viewed  as  an  integrated 
system 


Development  of  integrated  system  engineering 
environments 


Avionics  open  architectures  with  standard  interfaces, 
modularity  and  commonality  concepts 




Space  Concepts 


•  EARTH-TO-ORBIT TRANSFER

-  Shuttle Upgrade

-  Revised NLS (National Launch System)

-  New Concepts

•  ASTROPHYSICS

-  Miniature Spacecraft

-  SOFIA


•  PLANETARY  EXPLORATION 

-  Habitats 

-  Rovers 




Earth-to-Orbit  Transfer 


•  Shuttle  Upgrade 

-  Advanced  solid  rocket  motors 

-  Extended  duration  (30  days  on-orbit) 

-  Enhanced  navigation  system 

-  All  electronic  cockpit 

•  Revised  NLS 

-  Smaller  engines  (30-75  K  pound  payload) 

-  Larger  engines  (150-200  K  pound  payload) 


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Earth-to-Orbit  Transfer  (Cont’d)  ^ 


•  New  Concepts 

-  Single  Stage  to  Orbit  (SSTO)  Rockets 

-  Two  Stage  to  Orbit  (TSTO)  Rockets 

-  Combination Air Breather/Rockets (Single and two stage)

   •  NASP derivatives
   •  Dual manned booster/orbiter

-  Small version of Shuttle

   •  20 K payload vs. 30 K today
   •  20 foot payload bay vs. 60 foot
   •  Manned or unmanned operation
   •  Extensive built-in test and checkout
   •  Interactive operations with ground processing system
   •  Adaptive guidance and control
   •  Significant reduction in dependency on Mission Control




Astrophysics 


•  Miniature  Spacecraft  Project 

-  FY95  new  start  to  develop  capability  to  support  wide  range  of 
instruments  (visual  to  millimeter  wave  length)  for  astrophysics 

-  Reduce  size,  weight  and  power  enough  to  drop  1  or  2  sizes  in 
booster  requirements 

-  Plan  on  more  but  smaller  payloads,  even  if  overall  costs  are 
not  reduced 

-  Plan  on  evolution  over  life  of  multiple  launches 


•  SOFIA

-  Replacement  for  Kuiper  flying  observatory 

-  3  meter  telescope  (1  meter  in  KAO) 

-  New  version  of  747  for  higher  altitude 

-  Longer  flight  time 

-  Enhanced on-board processing

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Planetary  Exploration 


Rovers 

-  Wheeled  vehicles  for  Moon/Mars 

-  Submersible  vehicles  for  Earth's  oceans 

Habitats 

-  Return  to  the  Moon  is  stepping  stone  to  real  goal:  Mars 

-  Lunar/Mars  habitat  should  be  designed  for  real  time 
distributed  operation  from  start 

-  Ability  for  in  situ  sophisticated  science  data  analysis 

-  Real time Earth-Planet data analysis/interaction

-  Fully autonomous automated habitat living environment to
allow maximum science return

Ames  Human  Exploration  Demonstration  Project  (HEDP) 



Submersibles 

Submersible  vehicles  for  Earth’s  lakes  and  bays 

-  Relatively  complex  vehicles  being  developed  for  exploration 
of  dry  lake  beds  in  Antarctica 

   •  Helmet mounted display systems for camera control (vehicle next year)
   •  Smart instruments for underwater analytical support

-  Similar technology being applied to underwater research in
joint Ames/Monterey Bay Aquarium project

   •  Multiple sensor fusion
   •  3-D graphics displays
   •  Smart manipulators

-  All of these systems are pushing state-of-the-art in real time
control and data manipulation

   •  Vision processing is primary driver
   •  Interaction between multiple scientists and live scene and a virtual scene
      generated in real time


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Rovers 


Rover vehicles for Moon/Mars may have wheels or legs

-  No clear decision criteria at this time

-  Ames research focus on how to make the vehicle smart and
not on the mechanical system

   •  Mission planning
   •  Path planning and tracking
   •  Task scheduling (rescheduling)
   •  In situ data analysis with smart instruments: differential thermal analyzer,
      gas chromatograph, X-ray diffraction
   •  Data compression techniques
   •  Vehicle status monitoring, failure detection and reconfiguration




Summary 


•  There  are  a  wide  range  of  vehicles  and  systems  under 
consideration  for  future  projects  at  NASA  that  will  push  the 
state  of  technology  in  real  time  system  operation 

•  This  talk  has  not  addressed  some  of  the  short  term  issues 
such  as  upgrades  to  the  Shuttle  Mission  Control  Center 
and  development  of  the  Space  Station  Control  Center 
which  do  require  elements  of  real  time  distributed  control 

•  Flight  projects  have  been  stressed  because  these  systems 
have limited resources, yet still require some level of
complexity  to  be  found  in  large  ground  based  systems 


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Lunar  Habitat  Deployed  Configuration 


JSC  Lander 


Integrated HEDP Systems


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HEDP  Evolution 


INTEGRATED HABITAT        HUMAN POWERED CENTRIFUGE
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1192  Elena  Privada 
Mountain  View,  CA  94040 
Tel./FAX:  (415)  968-3476 
E-mail: armen@well.sf.ca.us
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Gabrielian 
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-  Too  expensive  to  develop  &  test  systems 

-  Critical  nature  of  many  embedded  systems 

•  Shortcomings:  Complexity  &  esoteric  notation 


Taxonomy  of  Formal  Systems 
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Finite-State  machines  (FSM’s) 


Representing  States  &  Behavior 
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(best-known  example:  Statecharts) 


Representing  Temporal  Constraints 
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•  Assumptions  about  future  behavior  of  system 
and  environment 

-  Strictly self-referential view of time delays


Views  of  Nondeterminism 
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Need  to  represent  temporal  uncertainty 


HMS  Machines 

“Hierarchical Multi-State” Machines
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Multiple  Verification  Methods 


=  A  Language  for  Time 


<-120,-2>“Message  received”  =»  “Data  Ava 


Characteristics  of  HMS  Machines 
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•  Transitions  expiicitly  designated  as  deterministic 
or nondeterministic. For simulation, probabilities
may be associated with transitions.
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Verifying  Correctness 
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system  failure  (SF)”  state: 

Property violated ⇔ SF = True


Verification  Strategy  and  Methods 
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Interactive  simulation 


Parallel  and  Distributed  Systems 


complicated  to  analyze 

Reasonable  assumption:  local  synchronous 
clocks with uncertain communication delays


The  Future 
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Dealing  with  complexity 
Automation 
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Verification  &  Refinement 
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TCEL  (Time-Constrained  Event  Language) 
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Step  1:  Section  Generation 
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Redesigned  RTS  Throughput 
On  the  SUN  Sparcstation,  Howes'  redesign  of  RTS  had  hi 
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Ida's  RTS  Throughput  on  Multprocessor 

Beyond  two  processors,  we  do  not  measure  an  increase 
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Finding  with  Howes*  Redesign 
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Excessive  calling  due  to  program  design 
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Unsafe  Timing  in  Task  Manage_Temperature_Reading 
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Nielsen  and  Shumate's  design  of  RTS  is 
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Real  Time  Design  Methodology  Assessment 
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LaDPART  Assessment  1 

•  Academic compartmentalization of "real-time" definition
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LaDPART  Systemic  Issue  of  Computer  Technology  overshadows 
— Overall  Design  Approach 
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Real-Time  Design  Methods  in  the  (  IDA 
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(Howes,  1990),  and  (Sanden,  1990). 
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Current  real-time  scheduling  theories  ( IDA 
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geting”  to  “design  in”  predictable  timing  behavior. 


The time-line model
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Evolution  of  real-time  models  ( IDA 
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-  Vertically  Fartitioned  Objects  (Hufnagel  &  Brown,  1989) 

-  Entity Life Modeling (Sanden, 1990)


Quotation: Stephen Hufnagel & James Brown,
IEEE TOSE, 15(1989) 8, p. 935
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Quotation:  Stephen  Hufnagel  &  James  Brown, 
IEEE  TOSE,  15(1989)  8,  p.  938 
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Some  experiments  we  conducted 
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How  the  comparisons  were  made. 
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ORGINAL  NIELSEN  &  SHUMATE  DESIGN  OF  THE 
REMOTE  TEMPERATURE  SENSOR 
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Ways  of  introducing  concurrency  into  a  design 
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Minimai  concurrency 

-  to optimize the performance on sequential machines and to
minimize  context  switching 


DESIGN ITERATION BASED ON REDUCING MEAN SERVICE TIMES
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N  &  S/GOMAA’S  DESIGN  OF  THE  ROBOT 
CONTROLLER  SYSTEM 

CP_Processing

Control_Panel

Buffer_Panel_Input
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I  REDESIGN  OF  THE  ROBOT  CONTROLLER  SYSTEM 

USING  NEW  PRINCIPLES 
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Sensor  Sensor 

Output  Input 
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Conclusions 
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The  physical  concurrency  principle  has  a  concep¬ 
tual  parallel  with  object  oriented  design. 
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Realtime  OS's  Which  Span  Wide  Ranges  Of  Applicability 
Require  Certain  Attributes  To  Be  Highly  Scaleable 

□  Many realtime users need and want a single OS architecture whose instances

♦  have a wide span of applicability
■  from non-realtime
■  to centralized tactical realtime subsystems
■  to decentralized mission-critical systems

♦  with consistent
■  interfaces
■  functional components
■  development tools

□  Such an OS architecture must be highly scaleable with respect to a number of its attributes,
most importantly including

♦  functionality
♦  performance
♦  timeliness
♦  predictability
♦  decentralization
♦  fault tolerance

because it is not cost-effective, or even possible, to build a one-size-fits-all OS for that wide span




No  One's  Current  Realtime  OS  Is  Very  Scaleable 


□  No one's current realtime OS architecture or products are more than just modestly scaleable

♦  each specific OS is suitable only for some relatively small range in the realtime application spectrum

♦  where the present state of the art is typified by

■  DEC OSF/1 (and its competitive realtime UNIX counterparts),
intended for full-function, centralized, low performance, low timeliness, low predictability,
low fault tolerance, realtime systems

■  DECelx (and its competitive realtime executive counterparts),
intended for reduced functionality (embedded), centralized, high performance, moderate
timeliness, moderate predictability, low fault tolerance, realtime systems


An  OS  Architecture  Can  Be  Scaleable 
In  Many  Different  Respects 

□  Such an OS architecture must be highly scaleable with respect to a number of its attributes,
most importantly including

♦  functionality
♦  performance
♦  timeliness
♦  predictability
♦  decentralization
♦  fault tolerance

□  Understanding the range and thus the scaleability of each of these attributes necessitates

♦  an improved understanding of the attributes themselves

♦  and thus the ability to think and communicate more clearly about them

than is normally done

□  To this end, we must

♦  clarify some extant terminology
♦  add some new terminology




OS Functionality Is Scaleable To The Extent That
It Can Be Changed Coherently Over The Spectrum


An Example Of Highly Scaleable OS Functionality
Is The Use Of A Microkernel-Based OS


□  OS (particularly realtime OS) functionality ranges from

♦  none: some realtime systems have no OS at all

♦  to simple embedded realtime subsystems with minimal OS (executive) functionality

♦  to complex realtime systems with extensive OS functionality

□  An OS architecture is functionally scaleable to the extent that its instantiations’

♦  functionality can be increased and decreased, e.g., to match the requirements of
■  the applications
■  trade-offs for other attributes

♦  functional interfaces increase and decrease coherently (i.e., superset and subset) with the
functionality

♦  code base increases and decreases, instead of having to be replaced, with the functionality

□  Functional scaleability may be static (at configuration time) or dynamic (at execution time)

□  Functional scaleability is particularly important because scaleability of other OS attributes
depends on it



□  A microkernel can be the basis of an OS architecture with high functional scaleability

□  A microkernel can support (normally statically)

♦  different functional “servers” as needed
■  OS, e.g., diskless executive, reduced functionality OS, full UNIX
■  system software, e.g., networking, data management
■  application software

♦  different API’s (“personalities”), even at the same time

♦  direct use of the native kernel interface

□  A microkernel (like any kernel) can be designed to permit system or application software
which has been developed in user space to be executed in the kernel address space

♦  this improves performance of time-critical code without the higher costs of developing
kernel code

(although this is heretical to some microkernel purists, at least outside of the realtime field)
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An  Example  Of  Highly  Scaiaabla  OS  Functionality 
Is The Use Of Policy/Mechanism Separation


An OS Architecture Can Be Scaleable
In Many Different Respects


□  The separation of resource management

♦  mechanisms
■  standard application-neutral building blocks
■  e.g., in the kernel

♦  policies
■  application-specific algorithms
■  e.g., in clients of the kernel

is an effective technique for achieving a high degree of functional scaleability

□  For example, policies can be separated from mechanisms for

♦  processor scheduling

♦  transactional data management: correctness, concurrency control, permanence

♦  secondary storage: consistency, recovery, interface semantics (e.g., objects, files)

♦  virtual memory: paging

□  Policies may be selected dynamically (i.e., at execution time) or statically (i.e., prior to
execution time)

□  This is essentially the dual of the more widely used principle of information hiding
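
The processor scheduling case can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not any particular kernel's interface: `Dispatcher` stands in for the application-neutral mechanism, the two policy functions stand in for application-specific algorithms supplied by clients, and all names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Thread:
    name: str
    priority: int    # larger value means more urgent (assumed convention)
    deadline: float  # absolute deadline in ms


# A policy only answers "which ready thread runs next?".
Policy = Callable[[List[Thread]], Thread]


def fixed_priority(ready: List[Thread]) -> Thread:
    return max(ready, key=lambda t: t.priority)


def earliest_deadline_first(ready: List[Thread]) -> Thread:
    return min(ready, key=lambda t: t.deadline)


class Dispatcher:
    """Mechanism: keeps the ready list and hands out the chosen thread.
    It contains no knowledge of how the choice is made."""

    def __init__(self, policy: Policy) -> None:
        self._ready: List[Thread] = []
        self._policy = policy

    def make_ready(self, thread: Thread) -> None:
        self._ready.append(thread)

    def set_policy(self, policy: Policy) -> None:
        # Dynamic selection: the policy can be swapped at execution time.
        self._policy = policy

    def dispatch(self) -> Optional[Thread]:
        if not self._ready:
            return None
        chosen = self._policy(self._ready)
        self._ready.remove(chosen)
        return chosen


dispatcher = Dispatcher(fixed_priority)
for t in (Thread("a", 3, 50.0), Thread("b", 1, 10.0), Thread("c", 2, 30.0)):
    dispatcher.make_ready(t)

order = [dispatcher.dispatch().name]          # chosen by priority
dispatcher.set_policy(earliest_deadline_first)
order += [dispatcher.dispatch().name, dispatcher.dispatch().name]  # chosen by deadline
print(order)
```

Passing the policy once to the constructor corresponds to static selection; calling `set_policy` while threads are queued corresponds to dynamic selection.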


□  Such an OS architecture must be highly scaleable with respect to a number of its attributes,
most importantly including

♦  functionality
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*  timeliness 

♦  predictability

♦  decentralization

♦  fault tolerance
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Raaltiine  OS  Performance  Is  Commonly  Measured 
In  Terms  Of  The  Magnitudes  Of  Response  Times 

□ Commercial realtime OS suppliers and users commonly
consider performance in terms of the magnitude of
the time to initiate a computation which is

* newly released: the metric is "interrupt response time"

(preemptive scheduling is the norm)

* already released: the metric is the "context switch time"
subset of interrupt response time
(regardless of whether scheduling is
  ■ preemptive, the usual case
  ■ or non-preemptive)

□ Performance as interrupt response time presumes the
fact that most realtime OS's are globally asynchronous
in the sense of being event-driven;

a minority of OS's seek determinism by being globally
synchronous in the sense of being time-driven,

and thus don't have interrupts

By These Metrics, High Performance Is 10-100 μSec

□  By  these  metrics,  using  contemporary  processor 
speeds 

* "high" realtime performance is typically regarded
to be on the order of
  ■ 10 microseconds for limited-functionality executives
  ■ 100 microseconds for full-functionality OS's

* "low" realtime performance is typically regarded
to be on the order of 1000 microseconds and more

Realtime OS Performance Is Best Measured In Terms Of
The Magnitudes Of Computation Completion Times

□ The performance of realtime OS's is more usefully
measured in terms of

* the end

* rather than one of the means:

the  magnitude  of  the  computation  completion  times 
(such  as  deadlines) 

which  can  be  attained  with  given  hardware 

□  This  is  the  performance  metric  that  deterministic  OS’s 
use — 

and globally synchronous ones must use (since they
have no interrupts)

A Realtime Computation Has A Time Constraint

□ We define a realtime computation to be a segment of a
computational entity (such as a thread, task, or process)
subject to a time constraint

□ A time constraint is the relationship between

* when a realtime computation completes execution
* the temporal merit of that computation
e.g., in the classical deadline case
  ■ completing before the deadline time is better
  ■ completing after the deadline time is worse

□ A time constraint is manifest in the computation program
as a demarcated region of code whose execution
completion time is subject to the time constraint,

e.g., the computation must complete execution of the
region before the deadline time arrives

BEGIN TC [DL 30 mS]

    Realtime Computation

END TC

Otherwise it must suffer an exception condition
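
A demarcated time constraint region of this kind can be approximated in a general-purpose language. The sketch below is an illustration under stated assumptions, not the notation used on the slide: `time_constraint` and `TimeConstraintViolation` are hypothetical names, the deadline is given in seconds relative to region entry, and the check happens only when the region completes (it does not preempt a region that overruns).

```python
import time
from contextlib import contextmanager


class TimeConstraintViolation(Exception):
    """Raised when a demarcated region completes after its deadline."""


@contextmanager
def time_constraint(deadline_s, clock=time.monotonic):
    """Demarcate a region whose completion time is subject to a deadline.

    deadline_s is relative to the moment the region is entered.
    clock is injectable so the behavior can be exercised deterministically.
    """
    start = clock()
    yield
    elapsed = clock() - start
    if elapsed > deadline_s:
        # The region finished late: surface this as an exception condition.
        raise TimeConstraintViolation(
            f"completed {elapsed - deadline_s:.6f} s after the deadline"
        )
```

A region with a 30 ms deadline would then be written as `with time_constraint(0.030): ...`, with the realtime computation as the body of the `with` statement.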
The Magnitudes Of Completion Time Constraints
Are Not Specified For The Majority Of OS's


Higher Performance Of Means Is Usually
Necessary But Not Sufficient For That Of Ends


□ The performance of realtime OS's in terms of the magnitude
of the computation completion time constraints
is usually unspecified

because computation completion time constraints are

* not yet supported directly by most commercial OS
products, which deal instead with
  ■ starting computations as fast as possible
  ■ "priorities", whose semantics are user-defined

* left to the users to satisfy by assigning and manipulating
priorities and resources

□ This is due to

* the historical context of realtime computing
  ■ the paucity of processing power and memory
    made direct control by the user necessary
  ■ the simplicity of low-level sampled-data centralized
    subsystems
    made direct control by the user possible

* user and vendor habituation despite evolutionary
changes in both aspects of that context

□ However, computation completion time constraints in
the form of deadlines (e.g., priority ceiling protocols
for rate-monotonic scheduling) have now appeared in
POSIX 1003.4b


□  Higher  performance  in  the  sense  of  interrupt  response 
time  is 

*  necessary  (in  the  usual  asynchronous  cases) 

*  but  not  sufficient 

for  higher  performance  in  the  sense  of  computation 
completion  times 

□ An effective technique for improving performance in
the sense of interrupt response time is kernel
preemptability

□ A key factor in performance in the sense of computation
completion times

is how resource dependencies and conflicts among
computations are resolved, e.g., by the scheduler


OS Performance Is Scaleable To The Extent That
It Can Be Changed At Lower Functional Granularity

□ An OS architecture's performance is scaleable to the extent
that its instantiations' performance

* can be increased and decreased in magnitude,
e.g., to match the requirements of
  ■ the applications
  ■ trade-offs for other OS attributes

* at lower levels of functional granularity, e.g., of the
  ■ individual OS or application computations
  ■ vs. OS functions or services
  ■ vs. the OS as a whole

Functional Scaleability Affects Performance Scaleability

□ Some kinds and degrees of performance scaleability
are more advantageous

if they are achieved without affecting functional interfaces

□ Performance scaleability is generally facilitated by
greater functional scaleability, e.g.,

* higher realtime performance (in any sense) is easier
to achieve in OS's having less
  ■ functionality in general
  ■ of certain functionality in particular

* realtime performance is generally greatly affected
by resource management policies, e.g.,
  ■ scheduling
  ■ concurrency control and synchronization
  ■ virtual and physical storage management
in any given application context

□ Performance may be scaled

* statically, e.g., configuring a different file system

* dynamically, e.g.,
  ■ turning preemption on/off
  ■ selecting a different scheduling algorithm

Higher Performance Does Not Necessarily Imply
Meeting All Lower Performance Requirements

□ Providing higher performance in the sense of shorter
interrupt response time

* also meets lower performance requirements
  ■ which implies performance scaleability means
    only the ability to scale up to higher performance

□ But providing higher performance in the sense of
shorter computation completion times does not necessarily
imply meeting lower performance requirements

* e.g., a particular realtime scheduling algorithm
may provide acceptable completion times
  ■ for a set of computations whose time constraint
    magnitudes are relatively
    ○ short
    ○ uniform
    (and thus constitute a higher performance requirement)
  ■ but not for a set of computations whose time
    constraint magnitudes are widely varied from
    short to long
    (which constitutes a lower performance requirement)

* this demonstrates that performance scaleability
in general means both up and down,
which calls for scaleable scheduling policies


An OS Architecture Can Be Scaleable
In Many Different Respects

□ Such an OS architecture must be highly scaleable with
respect to a number of its attributes,

most importantly including
* functionality
* performance


*  predictability 

*  decentralization 

*  fault  tolerance 


Timeliness Is The Basis For Realtime Scheduling

□ We consider the timeliness (i.e., temporal merit) of
computations to be the principal basis for

* specifying
* scheduling
* evaluating

computation completion times


We  define  timeliness  with  a  framework  consisting  of 
three  relationships  (e.g.,  functions) 


A  Timeliness  Framework  Is  Comprised  Of  Three  Parts 

□ Each realtime computation has a time constraint,
i.e., a relationship between

* when the computation completes execution

* the resulting temporal merit (timeliness) of that
computation

(e.g., for the classical deadline time constraint,
lateness = completion time - deadline)

□ A collective temporal merit relationship defines

* the collective timeliness of a set of computations

* in terms of the individual timeliness of all its constituent
computations

(e.g., the number of deadlines met, i.e., with negative
lateness)

□ A collective temporal acceptability relationship defines

* the acceptability, in an application-specific metric,

* of the completion times (predicted or experienced)
for a set of computations

expressed in terms of their individual or collective
timeliness

* for specified system and application states
(e.g., acceptable means always meeting all deadlines)
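
The three relationships can be written down directly for the classical deadline examples given above. This is a small sketch; the function names and the `(completion_time, deadline)` tuple layout are illustrative assumptions, and a deadline counts as met only when lateness is strictly negative, following the wording above.

```python
from typing import List, Tuple

Computation = Tuple[float, float]  # (completion_time, deadline), same time units


def lateness(completion_time: float, deadline: float) -> float:
    """Individual time constraint: lateness = completion time - deadline."""
    return completion_time - deadline


def deadlines_met(computations: List[Computation]) -> int:
    """Collective temporal merit: number of computations with negative lateness."""
    return sum(1 for c, d in computations if lateness(c, d) < 0)


def acceptable(computations: List[Computation]) -> bool:
    """Collective temporal acceptability: every deadline must be met."""
    return deadlines_met(computations) == len(computations)
```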
Timeliness For Classical Deadline Time Constraints
Is In Terms Of Tardiness


The Traditional Hard Deadline Case Allows Only For
Binary Timeliness And Acceptability


□ The classical deadline time constraint (i.e., in scheduling
theory) employs

*  lateness  =  completion  time  -  deadline 

*  or  tardiness  =  positive  lateness 

as  its  individual  measure  of  timeliness 


[Figure: a run-time axis with release time and deadline marked; lateness is negative before the deadline and positive (tardiness) after it]

□ The collective timeliness relationship of a set of computations
having classical deadline time constraints is
most frequently chosen to be one of the following

* the occurrence or not of at least one tardy (positive
lateness) completion

* the number of tardy completions

* the mean lateness

□ Classical deadline-based scheduling theory often implicitly
presumes that

collective temporal acceptability is equivalent to collective
timeliness


□ The traditional realtime computing interpretation of
"hard" deadlines implies restrictions of timeliness to

* a binary special case of the deadline time constraint:
timely and untimely

[Figure: a run-time axis with release time and deadline marked; completion is untimely after the deadline]

* a binary collective timeliness relationship
  ■ untimely: the occurrence of at least one tardy completion
  ■ timely: otherwise

* a binary measure of collective temporal acceptability
  ■ acceptable: no occurrence of tardy completions
    (unanimous optimum) under any conditions
  ■ unacceptable: the occurrence of at least one tardy
    completion under any conditions

where the semantics of "unacceptable" are specific
to the computation and application, e.g.,
  ■ non-productive
  ■ counter-productive
in some way
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Often  Time  Constraints  Are  Not  Binary 
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Often  Collective  Timeliness  Is  Not  Binary 


□ Often it is very useful or necessary to have softer (i.e.,
non-binary) time constraints

□ A common example of such a softer time constraint:

if a particular computation cannot be completed at a
time of optimal merit, i.e., before its "predeadline",

* completing it a little "tardy" has reduced merit,
but is better than not completing it at all

* however, completing it actually tardy (after its
deadline) has negative merit, i.e., is worse than
not completing it at all

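[Figure: merit plotted against run time, with release time, "predeadline", and deadline marked; merit is optimal before the predeadline, suboptimal between the predeadline and the deadline, and negative after the deadline]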
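
The predeadline example can be expressed as a piecewise merit function of completion time. This is only a sketch: the name `soft_merit`, the particular merit values, and the treatment of the boundary instants are assumptions chosen for illustration, with zero taken to be the merit of not completing the computation at all.

```python
def soft_merit(t, predeadline, deadline, optimal=1.0, reduced=0.5, penalty=-1.0):
    """Merit of completing at time t (measured from release).

    optimal, reduced, and penalty are hypothetical values; the only
    property the example needs is optimal > reduced > 0 > penalty.
    """
    if t <= predeadline:
        return optimal   # completed in the optimal region
    if t <= deadline:
        return reduced   # a little "tardy": still better than not completing
    return penalty       # actually tardy: worse than not completing
```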

□ Some softer time constraints are routinely handled in
terms of lateness with scheduling theory,

but the linearity of lateness greatly limits the interpretation
of merit (e.g., excludes this example)

□ Realtime computing practice tends to express and
handle softer time constraints even less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways


□ Softer time constraints necessitate correspondingly
"softer" (i.e., non-binary) collective timeliness relationships

□ Using the previous time constraint example,

the collective timeliness relationship could be one
which (as a scheduling criterion) increases the number
of completions in the optimal region, e.g.,

* the sum (or mean) of

* weighted lateness = (completion time - deadline)
+ k (completion time - predeadline)
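
As a worked illustration of the weighted lateness criterion, the sketch below sums it over a set of computations. The function names and the `(completion, deadline, predeadline)` tuple layout are assumptions; `k` is the weighting constant from the formula above.

```python
def weighted_lateness(completion, deadline, predeadline, k):
    """(completion - deadline) + k * (completion - predeadline)."""
    return (completion - deadline) + k * (completion - predeadline)


def collective_weighted_lateness(jobs, k):
    """Sum of weighted lateness over jobs given as (completion, deadline, predeadline)."""
    return sum(weighted_lateness(c, d, p, k) for c, d, p in jobs)
```

With `k = 2`, a job completing at 8 with deadline 20 and predeadline 10 contributes (8 - 20) + 2(8 - 10) = -16, while one completing at 15 contributes (15 - 20) + 2(15 - 10) = 5, so a scheduler minimizing the sum is pushed toward completions in the optimal region.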


□ Some softer collective timeliness relationships are
routinely handled in terms of lateness with classical
scheduling theory

while others necessitate more expressive time constraint
relationships

□ Realtime computing practice tends to express and
handle softer collective timeliness less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways

Often Temporal Acceptability Is Not Binary

□ Softer collective timeliness necessitates correspondingly
"softer" (i.e., non-binary) temporal acceptability
relationships

□ The degree of collective temporal acceptability might
be based on

* collective timeliness alone, e.g., acceptable
  ■ only above one lower bound under certain circumstances,
    and above a different lower bound
    under other circumstances

* both individual and collective timeliness, e.g., acceptable
to the degree that
  ■ some
    ○ total number of
    ○ or specific individual
    computations
  ■ are late by a certain amount
  ■ under certain conditions

□ Realtime computing practice tends to express and
handle softer temporal acceptability less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways

Timeliness Is Scaleable To The Extent That
The Choice Of The Three Relationships Is Unrestricted

□ The timeliness of an OS architecture is the degree to
which it supports the timeliness of

* application
* its own
computations

□ The timeliness of an OS architecture is scaleable to the
extent that its instantiations can accommodate

arbitrary (i.e., an unrestricted choice of) relationships
for

* individual time constraints
* collective timeliness
* temporal acceptability

Scaleable Time Constraint Relationships Are
Temporal Merit As A Function Of Completion Time


The Traditional Realtime Computing Interpretation
Of A Deadline Is A Downward Step Function


□ The time constraint relationship can be made arbitrary
by thinking explicitly of

individual temporal merit being any function fr of the
computation's completion time t

[Figure: merit plotted against computation completion time t]

□ The classical deadline function's merit of lateness is
then depicted as

[Figure: lateness plotted against computation completion time]

* a line

* with slope 1

* having a range of {-deadline, +∞}

* crossing the X axis at the deadline time (becoming
tardiness)


□ The traditional realtime computing interpretation of a
deadline, when viewed as a time constraint function,
is

[Figure: two plots of merit against computation completion time, each a downward step at the deadline; the lower merit value is 0 in one plot and -∞ in the other]

* a binary-valued, downward step function
  ■ completing the computation anytime between
    its release (X = 0) and deadline times is uniformly
    timely
  ■ and otherwise is uniformly untimely

* the smaller of the two binary merit values may be
  ■ 0: zero merit is attained for completing the computation
    after its deadline
  ■ -∞: a large merit penalty is incurred for completing
    the computation after its deadline
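
Both interpretations of a deadline can be written as time constraint functions of completion time t, measured from release. The sketch below is illustrative: the factory names are hypothetical, a merit of 1 for a timely completion is an assumption, and completing exactly at the deadline is treated as timely.

```python
import math


def classical_lateness(deadline):
    """Classical interpretation: a line of slope 1 that crosses zero at the deadline."""
    return lambda t: t - deadline


def hard_deadline(deadline, miss_value=0.0):
    """Traditional interpretation: a binary-valued, downward step function.

    miss_value is the smaller of the two merit values, e.g. 0.0 or -math.inf.
    """
    return lambda t: 1.0 if t <= deadline else miss_value
```

For a deadline of 30, `classical_lateness(30)(0)` gives -30 (the bottom of the range) and `hard_deadline(30, -math.inf)(31)` gives negative infinity.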
In Real Systems Very Often The Time Constraint
Is Neither Linear Nor Binary


Time Constraints Are Scaleable To The Extent That
They Are Defined By Arbitrary Functions


□ Both the classical and traditional realtime computing
interpretations of a deadline are often poor approximations
to actual realtime constraints


□ Time constraints are scaleable to the extent that they
* are arbitrary functions
* with arbitrary merit ranges


□ There are many cases in realtime applications where

* there is some diminished merit attained for completing
the computation within an allowable tardiness
period

* the merit is not constant prior to the "deadline"

* the penalty is not constant after the "deadline"

* the merit measure and range are application-specific

[Figures: example time constraint functions plotted against computation completion time]


□ Deadlines are not a general mechanism for expressing
scaleable realtime time constraints


□ The merit measure is application-specific and defined
system-wide

□ The computation completion time axis is the one the
scheduler uses; it may be

* physical
  ■ absolute ("calendar/wall clock") time
  ■ relative to (since) some past event

* logical: a number which monotonically increases,
but not necessarily at regular intervals

□ The origin of the time constraint function axes is the
current time (value of the system clock)

□ Time constraint functions are

* derived by the programmers directly from the requirements
and behavior of the realtime computation

(usually an application activity)

* subject to a system-wide engineering process

(just as are assignments of classical priorities)

Collective Timeliness Is The Scheduling Criterion

□ A realtime scheduler considers all released time constraints
between the current time and its horizon

[Figure: released time constraint functions plotted against computation completion time]

□ It assigns the estimated execution completion times,
and consequently the

* initiation times
* partial ordering
for those computations

□ It employs an algorithm chosen to optimize the collective
timeliness (i.e., scheduling) criterion

(perhaps taking into account other factors such as dependencies)

□ In general, collective timeliness is not necessarily
unanimously optimum with respect to the individual
computations' timeliness;

the traditional "hard" realtime computing criterion of
all computations meeting their deadlines is a popular
exception
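
The role of collective timeliness as the scheduling criterion can be illustrated with a deliberately naive scheduler that tries every ordering and keeps the one with the best collective value. This is a sketch under strong assumptions, not a practical algorithm: a single processor, non-preemptive back-to-back execution, known durations, and exhaustive search that is only feasible for a handful of computations. The name `best_order` and the `(name, duration, merit_fn)` layout are hypothetical.

```python
from itertools import permutations


def best_order(jobs, collective=sum):
    """Pick the execution order that maximizes collective timeliness.

    jobs: list of (name, duration, merit_fn), where merit_fn maps a
          completion time to that computation's individual merit.
    collective: combines the individual merits (sum by default).
    Returns (list_of_names, collective_value).
    """
    best = None
    for order in permutations(jobs):
        t = 0.0
        merits = []
        for _name, duration, merit_fn in order:
            t += duration                # estimated completion time
            merits.append(merit_fn(t))   # individual timeliness
        value = collective(merits)
        if best is None or value > best[1]:
            best = ([name for name, _, _ in order], value)
    return best
```

Swapping the `collective` argument changes the scheduling criterion without touching the search itself, which mirrors the separation between time constraint functions and the collective timeliness function.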
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Collective  Timeliness  Is  Scaleable  To  The  Extent  That 
It Is Defined By Arbitrary Functions

□ The collective timeliness relationship can be made arbitrary
by thinking explicitly of

collective timeliness being any function fc of the individuals'
timeliness

□ The common classical collective timeliness functions, e.g.,

* the number of tardy completions
* the mean lateness

can readily be generalized in terms of arbitrary individual
merit measures

□ The collective timeliness function fc for traditional realtime
computing's interpretation of hard deadline
time constraints (assuming their usual range of {0, 1})
is

fc*: the product of the individual merits
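
A few collective timeliness functions over individual merits can be sketched as follows. The function names are illustrative assumptions; `product_of_merits` corresponds to the hard deadline case with merits in {0, 1}, and the other two are one possible way to generalize "mean lateness" and "number of tardy completions" to arbitrary merit measures.

```python
from math import prod
from statistics import mean


def product_of_merits(merits):
    """Hard deadline case: 1 only if every individual merit is 1."""
    return prod(merits)


def mean_merit(merits):
    """Generalization of mean lateness to arbitrary merit values."""
    return mean(merits)


def count_below(merits, threshold):
    """Generalization of 'number of tardy completions': merits under a threshold."""
    return sum(1 for m in merits if m < threshold)
```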
Temporal Acceptability Is Scaleable To The Extent That
It Is Defined By Arbitrary Functions


Highly Scaleable Timeliness Is Facilitated By
Scheduler Policy/Mechanism Separation


□ The collective temporal acceptability relationship can
be made arbitrary by thinking explicitly of

collective temporal acceptability being any function fA
of

* the individuals' and collective timeliness

* other parameters, such as system state

□ The collective temporal acceptability function fA for
classical deadline time constraints is often null, i.e.,

□ The collective temporal acceptability function fA for
traditional realtime computing's interpretation of
hard deadline time constraints is

fA(...) = fc*(...)


□ Highly scaleable timeliness is facilitated by a form of
functional scaleability: scheduler policy/mechanism
separation

because the scheduling policy

employs the time constraint functions

* to optimize the collective timeliness function

* so as to meet the collective temporal acceptability
function criterion

□ The extent to which timeliness is achieved dynamically, i.e.,

* by the OS at execution time

* rather than by the programmers at system configuration
time

affects the impact of scaleable timeliness on the scheduler




An OS Architecture Can Be Scaleable
In  Many  Different  Respects 

□  Such  an  OS  architecture  must  be  highly  scaleable  with 
respect  to  a  number  of  its  attributes — 

most  importantly  including 

*  functionality 

*  performance 

* timeliness

* decentralization

* fault tolerance


Perfect  Timeliness  Is  An  Ideal  But  Generally  Unrealistic 

□  The  ideal  case  of  perfect  collective  temporal  merit 

*  every  computation  always  completing  execution 

*  at  an  optimum  time 
is  unrealistic  in  general 

□ Even though the traditional "hard" realtime cases are
intended (and commonly imagined) to achieve this
ideal,

* physical laws (especially in decentralized systems)

*  the  intrinsic  nature  of  the  applications  (especially 
at  mission  management  levels) 

generally  make  it 

*  non-cost-effective 

*  or  even  impossible 

□  Thus,  the  timeliness  (temporal  merit)  of  computations 
and  systems  is  not  necessarily  assured  and  known 

*  accurately 
* in advance

* or even at all

Predictability Is The Extent That
Timeliness Can Be Estimated Acceptably


The Degree Of Predictability Depends On
Parameter Knowledge And Scheduler Behavior


□  We  consider  predictability  to  be  the  degree  to  which 
timeliness  can  be  estimated  (in  advance)  acceptably 

□ Predictability applies to any level of a system, e.g.,

* individual computations of an application or OS
* sets of computations of an application or OS
* individual functions or services of an OS
* an OS as a whole
* a system as a whole

□  The  usual  practice  is  for  the  degree  of  predictability  to 
be 

* specified prior to system execution time

*  attained  at  system  design  time 

*  evaluated  after  system  execution  time 

□  But  it  is  possible  for  the  degree  of  predictability  to  be 

*  specified 

*  attained 

*  evaluated 

dynamically  (at  execution  time) 


□ The degree of predictability of a computation depends
on

* how well known are all parameter values of
  ■ the computation, e.g.,
    ○ arrival time
    ○ execution duration
  ■ and its future execution environment, e.g.,
    ○ resource dependencies on
    ○ conflicts with
    other computations

* how well controlled is the time evolution of the
processes which govern the computation's timeliness
  ■ particularly the scheduler,
    which must be responsible for scheduling all
    physical and logical resources, not just processor
    cycles, in systems needing high
    ○ performance of computation completion
    ○ timeliness
    ○ predictability
  ■ e.g., chaotic regimes are a significant risk in
    highly decentralized schedulers, especially realtime
    ones


The Degree Of Predictability Is Established According To
The Specific Interpretation Of "Acceptably"

□ The degree of predictability is then established according
to the application-specific interpretation of
"acceptably",

e.g., the estimate may be intended to be

* extremely precise in certain instances, e.g.,
  ■ for certain critical computations or services
  ■ under certain critical conditions
at the expense of being less so in the remainder

* versus being
  ■ less
  ■ but equally
precise in every instance


Determinism Is The Maximum Case Of Predictability

□ Deterministic computation in the realtime context literally
means that a computation's, or set of computations',
timeliness is known

* absolutely
* in advance

i.e., there is no uncertainty about anything which
could affect its timeliness

(at least barring faults, and preferably within acceptable
fault coverage premises)


□ A computation being deterministic does not imply
that its timeliness

* individually
* as a member of a set
is acceptable, only that it is known

(although the point of deterministic systems is for
them to be known to have acceptable collective timeliness)

Determinism Can Only Be Approached In Practice

□ There are very few actual realtime applications and
systems which (inherently or forcibly) meet this determinism
criterion of absolute timeliness certainty


□ Most are subject to some inevitable dynamic fluctuations
and variabilities of

* computation
* communication
timing, due to factors such as

* input data arrivals
* resource dependencies and conflicts
* overloads
* hardware and software exceptions

(not to mention faults, errors, and failures outside the
presumed coverage)
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Larger,  More  Decentralized  Realtime  Systems 
Are Generally Non-Stochastically Non-Deterministic

□  The  computation  and  execution  context  parameters  of 
many  realtime  systems, 

especially  larger,  more  complex,  more  decentralized 
ones, 

are often too asynchronous, i.e.,

* intermittent
* irregular
* interdependent

to have known, or computationally tractable, probability
distribution functions

□ Thus, these realtime systems must be treated as
non-stochastically non-deterministic

* for which the scheduling technology is still in its
infancy

* although non-realtime
  ■ algorithms
  ■ languages
  ■ models (e.g., Petri Nets)
commonly take advantage (for simplicity) of
making non-stochastically non-deterministic decisions,

as do an increasing number of realtime algorithms

Predictability Is Often Probabilistic

□ When the parameters of a computation and its future
execution environment are known in the form of random
variables,

so that their uncertainty is characterized by probability
distribution functions

* the computation's timeliness may be amenable to
stochastic analysis

* e.g., the probabilities of
  ■ execution completion at different times
  ■ corresponding temporal merit values
can be derived for certain situations

□ But, as with deterministic scheduling, many of the
most interesting cases are

* either known to be analytically intractable

* or still defy explicit solution

□ The contexts, and thus approaches, of stochastic
scheduling are predominately oriented toward non-realtime
objectives, such as makespan or flowtime
(throughput measures)

* which are analytically and computationally easier
than stochastic scheduling to meet due times

* and for which there is greater application demand
than from the realtime community
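
Where closed-form stochastic analysis is out of reach, the probability of completing by a given time can still be estimated by simulation. The sketch below is a Monte Carlo stand-in rather than an analytical method, and it assumes a single computation whose execution duration is the only random variable; `deadline_miss_probability` and its parameters are hypothetical names.

```python
import random


def deadline_miss_probability(sample_duration, deadline, trials=10_000, seed=0):
    """Estimate P(execution duration > deadline) by repeated sampling.

    sample_duration: function taking a random.Random and returning one
                     sampled execution duration.
    seed: fixed so that repeated runs give the same estimate.
    """
    rng = random.Random(seed)
    misses = sum(1 for _ in range(trials) if sample_duration(rng) > deadline)
    return misses / trials
```

For a duration uniformly distributed between 10 and 30 time units and a deadline of 25, the estimate should come out close to the exact value of 0.25.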
Predictability Is Most Commonly Expressed As
A  Least  Upper  Bound  On  Timeliness 

□ Predictability may be expressed in a variety of ways,
e.g.,

* an assured upper bound, the most common way

(a lesser or least upper bound, since any system's
timeliness could be said to be predictable by the
choice of one high enough)

* or a probability distribution function of timeliness
values

* or in terms of discontinuous rules which relate
various execution contexts to
  ■ estimated
  ■ bounded
  ■ or even specific
timeliness values,

those contexts being ones which are most
  ■ likely
  ■ important
  ■ or just readily relatable to timeliness estimations

Predictability Estimates Can Be Made
By A Variety Of Techniques

□ Predictability estimates may be made on the basis of

* formal analysis
* simulation
* code examination
* empirical measurement

□ Predictability evaluation

* is usually performed by empirical measurement

* but in extreme (e.g., asymptotically deterministic) cases
  ■ is unnecessary
  ■ because attainment of the specification is assured
(e.g., through formal synthesis)

OS Predictability Is Traditionally Focused On
Timeliness Of Initiating Computations

□ The realtime context and resulting perspective that
led commercial realtime OS suppliers and users to traditionally
consider

* performance of an OS per se
in terms of the time to initiate computations

* and acceptable timeliness of computation completions
to be primarily an
  ■ a priori
  ■ application programmer responsibility

also implies that predictability has been considered
likewise

□ Thus, OS predictability has been

* specified
* attained
* evaluated
predominately

* statically, as a design and implementation issue

* for the OS as a whole rather than for individual
  ■ functions or services
  ■ OS or application computations

OS Predictability Is Scaleable To The Extent That
It Ranges From No Need To Determinism

□ The predictability of an OS architecture is scaleable to
the extent that its instantiations can accommodate

* timeliness estimate acceptability ranging
  ■ from no need for OS-provided predictability
    ○ either no predictability is needed
    ○ or all needed predictability is the responsibility
      of the OS clients (e.g., application programmers)
    and any OS (un)predictability is irrelevant
  ■ to asymptotically deterministic OS-provided
    predictability,
    under sufficiently certain conditions

* at granularities ranging
  ■ from individual system and application computations
  ■ to individual OS functions or services
  ■ to the whole OS

□ This scaleable predictability may be

* static, which is less scaleability

* dynamic, which is more scaleability

□ We neither quantify nor weight these factors here

Very High Predictability Is Usually Attained
With Extraordinary Concepts and Techniques


□ Ubiquitous approaches for attaining very high degrees
of system and OS predictability (asymptotically
deterministic)

of computation completion timeliness

* are based on extraordinary concepts and techniques, e.g.,

■ globally and statically synchronous
» computation
» i/o

■ certain formal design and validation methods

that impose extremely high costs of many kinds,
including intolerance to any forms of uncertainties


□ Some proponents of those concepts and techniques
argue that they are

*  not  only  sufficient 

*  but  also  necessary 

for  such  high  degrees  of  predictability 


and  security  in  this  respect 

Higher Predictability Does Not Necessarily Imply
That All Lower Predictability Needs Can Be Met


□ These concepts and techniques for high predictability
are special cases

which do not scale down to

specifying, attaining, and evaluating
* specific or even non-specific
lesser degrees of predictability


□ This implies that high scaleability of predictability

* either must be achieved by new concepts and techniques
which do scale well

* or is inherently limited


Scaleable Predictability Is About Uncertainties And
Is Affected By Scaleable Timeliness And Functionality


□ Predictability is about dealing with uncertainties, so
scaleability of predictability must accommodate

* varied types and amounts of uncertainties
■ statically

■ dynamically

* together with varied trade-offs among
■ predictability
■ other attributes
■ and their costs

□ High predictability of computation completion timeliness

* depends primarily on the scheduler

* and thus the timeliness framework

so scaleability of predictability is strongly affected by
the scaleability of

* timeliness

* functionality (particularly policy/mechanism separation)
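
The policy/mechanism separation mentioned above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Job` fields, the two policy functions, and the non-preemptive dispatch loop are hypothetical and are not drawn from any particular OS described in these slides. The point is only that the dispatch mechanism stays fixed while the timeliness policy is swapped.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    release: float   # time at which the job becomes ready
    deadline: float  # time by which the job should complete
    cost: float      # execution time the job needs


def earliest_deadline_first(job):
    """Policy: prefer the ready job whose deadline is soonest."""
    return job.deadline


def first_come_first_served(job):
    """Policy: prefer the ready job that was released earliest."""
    return job.release


def run_to_completion(jobs, policy):
    """Mechanism: a non-preemptive dispatch loop; `policy` only ranks ready jobs.

    Returns a list of (job name, completion time, deadline met?) tuples.
    """
    pending = sorted(jobs, key=lambda j: j.release)
    ready = []
    results = []
    clock = 0.0
    index = 0
    sequence = 0  # tie-breaker so the heap never compares Job objects
    while index < len(pending) or ready:
        # Admit every job that has been released by the current time.
        while index < len(pending) and pending[index].release <= clock:
            heapq.heappush(ready, (policy(pending[index]), sequence, pending[index]))
            sequence += 1
            index += 1
        if not ready:
            # Idle until the next release.
            clock = pending[index].release
            continue
        _, _, job = heapq.heappop(ready)
        clock += job.cost
        results.append((job.name, clock, clock <= job.deadline))
    return results
```

Running the same job set under the two policies changes which deadlines are met without touching `run_to_completion`, which is the sense in which the scheduler's policy, rather than its mechanism, determines completion timeliness.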

An OS Architecture Can Be Scaleable
In Many Different Respects

□  Such  an  OS  architecture  must  be  highly  scaleable  with 
respect  to  a  number  of  its  attributes — 

most  importantly  including 

*  functionality 

*  performance 

*  timeliness 

*  predictability 

*  fault  tolerance 


Physical Dispersal Of Processors Is Defined By A Ratio
Between State Change And Communication Rates

□ Any particular pair of processors in a system is physically
dispersed to a degree defined in terms of the ratio
between

* the rate at which a processor can change state

* the rate at which processor state changes can be
communicated between them

(not, as commonly thought, simply to the extent of
their separation)
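
As a rough numerical illustration of this definition, the ratio can be computed from a state change rate and a communication latency. The function names and all figures below are hypothetical, chosen only to show how the ratio orders different kinds of processor pairs; the slides themselves give no numbers.

```python
def communication_rate_hz(latency_s):
    """Assume one state change can be communicated per `latency_s` seconds."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / latency_s


def dispersal_ratio(state_change_rate_hz, comm_rate_hz):
    """State change rate divided by communication rate; larger means more dispersed."""
    if state_change_rate_hz <= 0 or comm_rate_hz <= 0:
        raise ValueError("rates must be positive")
    return state_change_rate_hz / comm_rate_hz


# Hypothetical figures: a processor changing state 50 million times per second,
# paired with a peer over three different (made-up) communication latencies.
STATE_CHANGE_RATE_HZ = 50e6
EXAMPLE_LATENCIES_S = {
    "shared-memory pair": 200e-9,
    "backplane message-passing pair": 20e-6,
    "network pair": 1e-3,
}


def rank_by_dispersal(latencies_s, state_change_rate_hz=STATE_CHANGE_RATE_HZ):
    """Return (pair name, ratio) tuples ordered from least to most dispersed."""
    ratios = {
        name: dispersal_ratio(state_change_rate_hz, communication_rate_hz(latency))
        for name, latency in latencies_s.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1])
```

Note that physical distance appears nowhere in the calculation: two pairs at the same separation can have very different ratios if their communication paths differ.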



The Magnitude Of This Ratio Is Hardware Dependent

□ This state change/communication ratio is

* relatively small in uniform memory access multiprocessors

* somewhat greater in non-uniform memory access
(NUMA) multiprocessors,

i.e., in which memory references take place (typically
over a backplane bus) among

■ uniprocessors (with or without local memory)

■ uniform memory access multiprocessors (sometimes
called "clusters" in this context)

■ global memory

* much greater in multicomputers: processing elements
(processor/memory pairs) which intercommunicate
by messages over

■ a shared backplane

■ serial bus/ring

■ private links (e.g., meshes)

(sometimes called NO Remote Memory Access, or
NORMA, architectures)

* greatest in networks and "distributed systems"


The State Change/Communication Ratio
Generally Is Dynamic Within Each Range


Software Is Physically Decentralized To The Degree
That Processor Dispersal Is Significant To It


□ Software computations are physically decentralized to
the degree that

the state change/communication ratio which defines the
physical dispersal of processors

* is significant to

* i.e., must be explicitly recognized and accommodated
by

those computations themselves

□ We regard this significance as qualitative and do not
quantify it

□ Physical centralization is one end point in the dimension
of physical decentralization

□ (Software is also logically decentralized to different
degrees, which we will not address here)


Physical Dispersal Has Various Fundamental Effects
Which Are Significant To Decentralized Software


□ The significance of the processor state change/communication
ratio is manifest in the software's need to
explicitly recognize and accommodate aspects of

* locality of references in space and time
* the binding of computations'

■ code segments

■ current execution points
■ data

to processors

* the

■ identities

■ physical locations
of the processors

* the

■ magnitudes
■ uniformity
■ variability

of the interprocessor communication times

■ whether memory (in the case of multiprocessors)

■ or i/o (in the cases of multicomputers, networks,
and distributed systems)

Only Node-Local Computations Are Centralized

□ The only computations that are centralized (i.e., to
which the processor state change/communication ratio
is insignificant)

are those which are entirely local to a node

* either a uniprocessor, i.e., one processor/memory
pair

* or multiple processors having negligible physical
dispersal,

i.e., a multiprocessor with only globally shared,
uniform access time, memory

In Multinode Computer Systems
Some Computations Span Multiple Nodes


□ In computer systems that have a multiplicity of nodes,
there must be some computations that

* span multiple nodes, i.e., are trans-node


[Figure: a row of four nodes]


* and thus are necessarily physically decentralized
to some degree

□ The least decentralized that multinode computations
can be is on a NUMA multiprocessor

* by definition

(i.e., differentiating it from a UMA multiprocessor,
which is single-node)

* its state change/communication ratio has unavoidably
lower bounded significance to all trans-node
computations,

e.g., on locality of code and data references in
space and time (and thus on performance at least)

Computations At Different Levels Of A System
Are Generally Decentralized To Different Degrees

□ Computations exist at different levels of the system

*  from  the  applications 

*  down  to  the  OS  and  kernel 

*  (and  even  the  processor  interconnect  hardware 
can  be  thought  of  as  comprising  computations) 

□  The  physical  decentralization  of  computations  can  dif¬ 
fer  at  different  levels  of  the  system 

(physical  decentralization  of  the  processor  intercon¬ 
nect  hardware  is  always  maximum) 


An OS Is Decentralized To The Extent That
All Its Computations Are

□  A  centralized  kernel  or  OS  (or  any  other  level  in  the 
system)  is  one  which  has  only  centralized  computa¬ 
tions  and  services — 

which  implies  that 

*  its  computations  and  services  are  confined  entire¬ 
ly  to  a  single  node 

*  any  accommodation  or  exploitation  of  physical  dis¬ 
persal  must  be  performed  at  one  or  more  higher 
levels  in  the  system 

□ A kernel or an OS (or any other level in the system) is
decentralized to the extent that

*  each  of  its  computations  and  services 

*  is  decentralized 
and  thus  trans-node 

A Decentralized OS Generally Is Not Suitable For
Lower Or Higher Physical Dispersal Than Intended


□ A kernel or an OS (or any other level in the system)
which is decentralized to any given degree

will not necessarily be suitable for a different

* lower

* higher

physical dispersal than it was intended for


An OS Intended For Lower Physical Dispersal
Generally Will Not Function Correctly With Higher


□ A kernel or OS (or any other level in the system)
intended for lower physical dispersal generally will not
function correctly with higher physical dispersal,

due to its lack of capability for decentralization
(accommodating effects of the state change/communication
ratio), e.g.,

* a centralized OS generally will not work for a NUMA
multiprocessor,

e.g., because its centralized virtual memory management
cannot handle the non-locality of concurrent
references among "clusters"

* an OS for a NUMA multiprocessor generally will not
work for a "distributed" (NORMA) system,

e.g., because of the absence of coherent shared global
state which it depends on,

such as for intercomputation communication and
synchronization

An OS Intended For Higher Physical Dispersal
Generally Is Not Cost-Effective With Lower

□ A kernel or OS (or any other level in the system)
intended for higher physical dispersal generally is not
cost-effective with lower physical dispersal,

due to the execution overhead of decentralization
(accommodating effects of the state change/communication
ratio), e.g.,

* any multiprocessor OS has unnecessary overhead
on a uniprocessor,

e.g., because of its locks

* a NUMA multiprocessor OS has even more unnecessary
overhead on a single-node machine,

e.g., because of its more complex virtual memory
management

* a "distributed" OS may have unnecessary overhead
on a

■ NUMA multiprocessor
■ single-node machine

because of its facilities (e.g., for intercomputation
communication) to overcome the absence of coherent
shared global state


Decentralization  Is  Scaleable  To  The  Extent  That 
It  Is  Independent  Of  The  Magnitude  Of  Dispersal 

□ The decentralization of a computation is scaleable to
the extent that

* the significance to that computation of physical
dispersal (the state change/communication ratio)

* is independent of the magnitude of the physical
dispersal

□ The decentralization of an

* OS service

* OS

* or any other level in the system

is scaleable to the extent that the decentralization of
each of its computations is scaleable

□ An OS (or any other level in the system) which has
maximally scaleable decentralization is entirely
independent of physical dispersal,

i.e.,  can  operate  correctly  on  any 

*  single-node 

*  multinode 
architecture 

(cf.  delay-insensitive  logic) 

The Decentralization Of Computations At Each Level
Is A Fundamental Multinode Architecture Decision


□ A fundamental multinode system architecture decision
is the degree of physical decentralization

* not just at each level of the system
* but also for the various computations at each level


One End Of The Multinode Architecture Spectrum
Is Highly Physically Decentralized At Every Level


□ One end of the multinode system architecture spectrum
reflects the processor dispersal up to the (thus
highly physically decentralized) application level(s)

to achieve efficiency benefits from

* the programming and execution structures of the
system being relatively congruent with that of the
application, e.g.,

■ M-ary N-cube architectures
■ and message-based OS's

are a good match for the computational structure
of certain physical science applications

* avoiding overhead incurred by virtualizing away
the physical dispersal

High Decentralization At All Levels Of The System
Has Been Popular For Supercomputers And Realtime

□ This end of the spectrum is historically (but
diminishingly) the choice for multinode

* supercomputers

* realtime computers

because the users of each have tended to

* trade off cost-effectiveness for maximum performance

* be less concerned with
■ legacy software

■ costs of learning and tools for decentralization

The Other End Of The Multinode Architecture Spectrum
Is Highly Physically Centralized At Every Level

□ The other end of the multinode system architecture
spectrum has

* as many computations

* as physically centralized as possible

□ The goal of this is to minimize the impact of physical
dispersal on software costs, e.g., by

■ staying closer to familiar centralized programming
techniques and tools

■ preserving legacy software

■ being independent of the physical dispersal aspects
of different multinode architectures

□ The approach is

* to create a virtual system for the maximum number
of the higher levels

which is

■ as centralized as possible
■ given the processor physical dispersal

* by being highly decentralized at
■ the minimum number

■ of the lowest level(s)




Lower Levels Can Create A Virtual NUMA System
So That Higher Levels Can Be More Centralized


A Virtual NUMA Multiprocessor Can Be Created
By The Processor Interconnect Or Kernel/OS Levels


□  Trans-node  computations  at  higher  levels  can  be  more 
physically  centralized 

if  the  trans-node  computations  at  one  or  more  levels 
below  it 

*  create  a  virtual  more  centralized  system 

*  by  being  highly  physically  decentralized — e.g., 
providing  a  high  degree  of  node  transparency 


[Figure: computations at adjacent system levels]

Single NUMA Multiprocessor


□  The  most  centralized  virtual  machine  possible  in  a 
multinode  system  is  a  NUMA  multiprocessor  at  all  lev¬ 
els 

*  virtualization  cannot  reduce  the  fixed  (or  lower 
bounded)  state  change/comunication  ratios 

*  these  ratios  have  unavoidably  lower  bounded  sig¬ 
nificance  to  all  trans-node  computations — 

e.g.,  locality  of  reference  in  space  and  time 


□ The lowest system level which can create this virtual
NUMA multiprocessor is the processor interconnect
hardware,

cf. the KSR-1, and numerous distributed shared memory
research projects

* minimizes multinode impact on all software from
the OS kernel up

* requires innovative, non-standard, expensive processor
interconnect hardware

□ Virtualization assistance may be provided by the OS
kernel to simplify the interconnect hardware,

cf. SGI's rumored forthcoming multinode product, and
numerous other distributed shared memory research
projects


□ Given conventional processor interconnect hardware

that doesn't virtualize the nodes,

the kernel and OS are the lowest levels which can do so

* cf. the OSF/RI version of Mach 3 and OSF/1 (OSF/1 AD)
that provides NUMA virtualization on Intel's NORMA
Paragon hypercube

* (not to be confused with the version of Mach 3 that
CMU modified to run on NUMA multiprocessors)

A Virtual NUMA Multiprocessor Is Often Desired For
Extant OS And Application Layers


□ Presently, it is most frequently desired that multinode
systems have minimum impact on the extant

* OS

* as well as applications

□ This implies that one or more levels of computation
between the hardware and the applications must

* be highly decentralized

* and provide the desired degree of virtualization
(e.g., a NUMA multiprocessor)

□ Some of this virtualization may be provided by
intermediate levels such as

* a distributed execution environment (e.g., DCE)

* an object-oriented execution environment (e.g.,
OMA-based)

□ But such intermediate levels typically

* do not have direct access to kernel and OS level
resources

* only via conventional centralized OS services,
which limits their degree of

■ decentralization
■ timeliness


The Degree Of Decentralization Need Not Change
Monotonically By System Level


□ The degree to which computations are physically
decentralized need not change monotonically by system
level,

e.g., physical decentralization is commonly
* low at the OS level

to allow the use of extant node OS's which were not
intended to perform trans-node management of
resources other than for networking

* high at an intermediate distributed execution
environment (e.g., DCE) level

to reduce the trans-node resource management
obligations of the

■ application programs above
■ node OS's below


* moderate at the application software level

■ to reduce the trans-node resource management
obligations (thus costs) of those programs

■ while retaining ability to sufficiently manage
and exploit the system's structure

There Are Needs To Bypass Virtualization
In Multinode Systems

□ The programmers at a physically centralized level
occasionally need to

* bypass some virtualization

* and perhaps also employ some decentralized computation

e.g.,

* when they desire to see or control some software/hardware
binding for

■ performance, due to node locality of execution
and data access

■ fault tolerance, by partitioning and replication

* in the case of certain service outages where

■ application-specific recourse must be taken
■ or the end-to-end argument applies


OS Fault Tolerance Is Scaleable To The Extent That
Specific Kinds And Degrees Can Be Provided

□ Fault tolerance is the extent to which a system

* either exhibits a well-defined failure behavior
when elements fail

* or masks element failures (continues to provide
service) to its users

□ In most applications, temporary errant behavior or
service unavailability is acceptable;

in many others, especially realtime ones,

* one or both kinds of fault tolerance are required

* to various degrees

* under various circumstances

□  An  OS  architecture’s  fault  tolerance  is  scaleable  to  the 
extent  that  its  instantiations  can  exhibit  the  kind  and 
degree  of  fault  tolerance  desired  for  a  particular 

*  system 

*  service 
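
The two kinds of fault tolerance distinguished above (a well-defined failure behavior versus masking of element failures) can be contrasted in a short sketch. This is an illustration under stated assumptions: the `ElementFailure` exception, the function names, and the use of replication with majority voting as the masking technique are all hypothetical choices made here, not mechanisms prescribed by the slides.

```python
from collections import Counter


class ElementFailure(Exception):
    """Raised when a service element fails in a well-defined, detectable way."""


def fail_stop_call(replica, request):
    """Kind 1: well-defined failure behavior.

    Any fault inside `replica` is surfaced as a single, documented exception
    instead of an arbitrary error.
    """
    try:
        return replica(request)
    except Exception as exc:
        raise ElementFailure(f"replica failed on {request!r}") from exc


def masked_call(replicas, request):
    """Kind 2: masking of element failures.

    Calls every replica and returns the strict-majority answer, so the caller
    keeps receiving service while a minority of replicas have failed.
    Answers must be hashable so they can be counted.
    """
    answers = []
    for replica in replicas:
        try:
            answers.append(fail_stop_call(replica, request))
        except ElementFailure:
            continue  # a failed replica simply casts no vote
    if not answers:
        raise ElementFailure("all replicas failed")
    value, votes = Counter(answers).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ElementFailure("no majority among replicas")
    return value
```

Varying the number of replicas per service is one simple way an instantiation could provide different degrees of masking for different services, at correspondingly different cost.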


An OS Architecture Can Be Scaleable
In Many Different Respects


□ Such an OS architecture must be highly scaleable with
respect to a number of its attributes,

most importantly including
* functionality

* performance

* timeliness

* predictability
* decentralization


stuff about how to have scaleable fault tolerance


* computation
* circumstance

Realtime Computing Arose In A Historical Context

□ Realtime computing as we think of it today arose in a
historical application and hardware context

which definitively shaped its perspective on technological
goals and approaches

□ The most salient characteristics of that context:

* relatively small, simple, centralized subsystems
for low-level sampled-data monitoring and control,
e.g.,

■ acquisition and analysis of signals, such as

» process data
» lab instrument readings
» radar

■ feedback control of sensors and actuators such
as

» manufacturing and process equipment

» testers and analyzers

» aircraft flight surfaces and weapons

* chronic insufficiency of hardware resources
(especially processing and memory)

due to restricted
■ cost

■ size, weight, power

The Realtime Application And Hardware Context
Is Expanding Dramatically In The 1990s

□  Both  of  these  defining  characteristics  of  the  tradition¬ 
al  realtime  context  are  now  changing  so  much  in  de¬ 
gree  that  they  are  changing  in  kind 

□  An  increasing  number  of  realtime  computing  applica¬ 
tions  are  becoming 

*  larger 

*  more  complex 

*  more  decentralized 

*  higher  level  (strategic) 

*  systems 

□  Computer  hardware,  particularly  microprocessor  exe¬ 
cution  speed  and  memory  size, 

is  growing,  and  dropping  in  cost,  at  an  extremely  fast 
pace 

□  These  expansions  of  the  realtime  application  and 
hardware  context  violate  many  of  the  premises  under¬ 
lying  the  conventional  realtime  computing  mindset 
and  technology 

□  A  new  perspective  and  new  scaleable  resource  man¬ 
agement  technology — a  new  paradigm — is  required 
for  this  broadened  realtime  context 


Asynchronous  Decentralization 
Impacts  The  Nature  Of  Realtime  Resource  Management 

□  Asynchronous  Decentralized  Realtime  Computing 

□  A  New  Paradigm  For  Scaleable  Realtime  Computing 

Application Pull And Technology Push Are Leading To
Increasing Physical And Logical Decentralization


□ Decentralized realtime computing is called for by an
application most frequently because

* application resources, e.g.,

■ factory or plant machinery

■ combat platforms

are inherently spatially dispersed

* survivability, in the sense of graceful degradation
for continued availability of situation-specific
functionality

■ is usually more cost-effective by replication and
partitioning

■ than attempting physically centralized functionality
which is infallible or indestructible

□ Decentralized realtime computing is implied by
technology most frequently because

* multiple smaller processors are now very often
more cost-effective than a single larger one

* the high performance of current processors compared
to that of memory subsystems necessitates
multicomputers with message-passing over a
backplane bus

□ Decentralization may be physical or logical


Physical Dispersal Of Processors Is Defined By A Ratio
Between State Change And Communication Rates


□ Any particular pair of processors in a system is physically
dispersed to a degree defined in terms of the ratio
between

* the rate at which a processor can change state

* the rate at which processor state changes can be
communicated between them

(not, as commonly thought, simply to the extent of
their separation)

The Magnitude Of This Ratio Is Hardware Dependent

□ This state change/communication ratio is

* relatively small in uniform memory access multiprocessors

* somewhat greater in non-uniform memory access
(NUMA) multiprocessors,

i.e., in which memory references take place (typically
over a backplane bus) among

■ uniprocessors (with or without local memory)

■ uniform memory access multiprocessors (sometimes
called "clusters" in this context)

■ global memory

* much greater in multicomputers: processing elements
(processor/memory pairs) which intercommunicate
by messages over

■ a shared backplane
■ serial bus/ring
■ private links (e.g., meshes)

(sometimes called NO Remote Memory Access, or
NORMA, architectures)

* greatest in networks and "distributed systems"


The  State  Change/Communication  Ratio 
Generally  Is  Dynamic  Within  Each  Range 

□ This ratio is in general dynamically variable within
each of the above four ranges, e.g., due to

* communication queuing in the i/o interconnect
architectures

* changes in virtual to physical memory mapping in
the memory interconnect architectures

Software Is Physically Decentralized To The Degree
That Processor Dispersal Is Significant To It


□ Software computations are physically decentralized to
the degree that

the state change/communication ratio which defines the
physical dispersal of processors

* is significant to

* i.e., must be explicitly recognized and accommodated
by

those computations themselves

□ We regard this significance as qualitative and do not
quantify it

□ Physical centralization is one end point in the dimension
of physical decentralization


(Software is also logically decentralized to different
degrees, which we will address later)


Physical Dispersal Has Various Fundamental Effects
Which Are Significant To Decentralized Software


□ The significance of the processor state change/communication
ratio is manifest in the software's need to
explicitly recognize and accommodate aspects of

* locality of references in space and time

* the binding of computations'

■ code segments

■ current execution points
■ data
to processors

* the

■ identities
■ physical locations
of the processors

* the

■ magnitudes
■ uniformity
■ variability

of the interprocessor communication times

■ whether memory (in the case of multiprocessors)

■ or i/o (in the cases of multicomputers, networks,
and distributed systems)

Only Node-Local Computations Are Centralized

□ The only computations that are centralized (i.e., to
which the processor state change/communication ratio
is insignificant)

are those which are entirely local to a node

* either a uniprocessor, i.e., one processor/memory
pair

* or multiple processors having negligible physical
dispersal,

i.e., a multiprocessor with only globally shared,
uniform access time, memory

In Multinode Computer Systems
Some Computations Span Multiple Nodes


□ In computer systems that have a multiplicity of nodes,
there must be some computations that

* span multiple nodes, i.e., are trans-node


[Figure: a row of four nodes]


* and thus are necessarily physically decentralized
to some degree

□ The least decentralized that multinode computations
can be is on a NUMA multiprocessor

* by definition

(i.e., differentiating it from a UMA multiprocessor,
which is single-node)

* its state change/communication ratio has unavoidably
lower bounded significance to all trans-node
computations,

e.g., on locality of code and data references in
space and time (and thus on performance at least)

Computations At Different Levels Of A System
Are Generally Decentralized To Different Degrees

□ Computations exist at different levels of the system

* from the applications

* down to the OS and kernel

* (and even the processor interconnect hardware
can be thought of as comprising computations)

□ The physical decentralization of computations can differ
at different levels of the system

(physical  decentralization  of  the  processor  intercon¬ 
nect  hardware  is  always  maximum) 


An OS Is Decentralized To The Extent That
All Its Computations Are

□ A centralized kernel or OS (or any other level in the
system) is one which has only centralized computations
and services,

which implies that

* its computations and services are confined entirely
to a single node

* any accommodation or exploitation of physical dispersal
must be performed at one or more higher
levels in the system

□ A kernel or an OS (or any other level in the system) is
decentralized to the extent that

* each of its computations and services

* is decentralized
and thus trans-node

A Decentralized OS Generally Is Not Suitable For
Lower Or Higher Physical Dispersal Than Intended

□  A  kernel  or  an  OS  (or  any  other  level  in  the  system) 
which  is  decentralized  to  any  given  degree 

will  not  necessarily  be  suitable  for  a  different 

♦  lower 

*  higher 

physical  dispersal  than  it  was  intended  for 


An  OS  Intended  For  Lower  Physical  Dispersal 
Generally  Will  Not  Function  Correctly  With  Higher 

□ A kernel or OS (or any other level in the system)
intended for lower physical dispersal generally will not
function correctly with higher physical dispersal,

due to its lack of capability for decentralization
(accommodating effects of the state change/communication
ratio), e.g.,

* a centralized OS generally will not work for a NUMA
multiprocessor,

e.g., because its centralized virtual memory management
cannot handle the non-locality of concurrent
references among "clusters"

* an OS for a NUMA multiprocessor generally will not
work for a "distributed" (NORMA) system,

e.g., because of the absence of coherent shared global
state which it depends on,

such as for intercomputation communication and
synchronization
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An  OS  Intended  For  Higher  Physical  Dispersal 
Generally Is Not Cost-Effective With Lower

□ A kernel or OS (or any other level in the system) intended
for higher physical dispersal generally is not
cost-effective with lower physical dispersal,

due to the execution overhead of decentralization (accommodating
effects of the state change/communication
ratio), e.g.,

* any multiprocessor OS has unnecessary overhead
on a uniprocessor,

e.g., because of its locks

* a NUMA multiprocessor OS has even more unnecessary
overhead on a single-node machine,

e.g., because of its more complex virtual memory
management

* a “distributed” OS may have unnecessary overhead
on a

■ NUMA multiprocessor

■ single-node machine

because of its facilities (e.g., for intercomputation
communication) to overcome the absence of coherent
shared global state


Decentralization Is Scaleable To The Extent That
It Is Independent Of The Magnitude Of Dispersal

□ The decentralization of a computation is scaleable to
the extent that

* the significance to that computation of physical
dispersal (the state change/communication ratio)

* is independent of the magnitude of the physical
dispersal

□ The decentralization of an

* OS service

* OS

* or any other level in the system

is scaleable to the extent that the decentralization of
each of its computations is scaleable

□ An OS (or any other level in the system) which has
maximally scaleable decentralization is entirely independent
of physical dispersal,

i.e.,  can  operate  correctly  on  any 

*  single-node 

*  multinode 
architecture 

(cf. delay-insensitive logic)

The Decentralization Of Computations At Each Level
Is A Fundamental Multinode Architecture Decision


□ A fundamental multinode system architecture decision
is the degree of physical decentralization

* not just at each level of the system

* but also for the various computations at each level
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One  End  Of  The  Multinode  Architecture  Spectrum 
Is  Highly  Physically  Decentralized  At  Every  Level 

□ One end of the multinode system architecture spectrum
reflects the processor dispersal up to the (thus
highly physically decentralized) application level(s)

to achieve efficiency benefits from

* the programming and execution structures of the
system being relatively congruent with that of the
application, e.g.,

■ M-ary N-cube architectures

■ and message-based OS's

are a good match for the computational structure
of certain physical science applications

* avoiding overhead incurred by virtualizing away
the physical dispersal

High Decentralization At All Levels Of The System
Has Been Popular For Supercomputers And Realtime

□ This end of the spectrum is historically (but diminishingly)
the choice for multinode

* supercomputers
* realtime computers

because the users of each have tended to

* trade off cost-effectiveness for maximum performance

* be less concerned with

■ legacy software

■ costs of learning and tools for decentralization

The Other End Of The Multinode Architecture Spectrum
Is Highly Physically Centralized At Every Level

□ The other end of the multinode system architecture
spectrum has

* as many computations
* as physically centralized as possible

□ The goal of this is to minimize the impact of physical
dispersal on software costs, e.g., by

* staying closer to familiar centralized programming
techniques and tools

* preserving legacy software

* being independent of the physical dispersal aspects
of different multinode architectures


□ The approach is

* to create a virtual system for the maximum number
of the higher levels

which is

■ as centralized as possible
■ given the processor physical dispersal

* by being highly decentralized at
■ the minimum number
■ of the lowest level(s)
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Lower  Levels  Can  Create  A  Virtual  NUMA  System 
So  That  Higher  Levels  Can  Be  More  Centralized 


□ Trans-node computations at higher levels can be more
physically centralized

if the trans-node computations at one or more levels
below it

* create a virtual more centralized system

* by being highly physically decentralized, e.g.,
providing a high degree of node transparency

[Figure labels: Computation, Computation, Computation; Single NUMA Multiprocessor]


A Virtual NUMA Multiprocessor Can Be Created
By The Processor Interconnect Or Kernel/OS Levels


□ The most centralized virtual machine possible in a
multinode system is a NUMA multiprocessor at all levels

* virtualization cannot reduce the fixed (or lower
bounded) state change/communication ratios

* these ratios have unavoidably lower bounded significance
to all trans-node computations,

e.g., locality of reference in space and time


□ The lowest system level which can create this virtual
NUMA multiprocessor is the processor interconnect
hardware,

cf. the KSR-1, and numerous distributed shared memory
research projects

* minimizes multinode impact on all software from
the OS kernel up

* requires innovative, non-standard, expensive processor
interconnect hardware


□ Virtualization assistance may be provided by the OS
kernel to simplify the interconnect hardware,

cf. SGI’s rumored forthcoming multinode product, and
numerous other distributed shared memory research
projects

□ Given conventional processor interconnect hardware

that doesn’t virtualize the nodes,

the kernel and OS are the lowest levels which can do so

* cf. the OSF/RI version of Mach 3 and OSF-1 (OSF-1/AD)
that provides NUMA virtualization on Intel's NORMA
Paragon hypercube

* (not to be confused with the version of Mach 3 that

CMU modified to run on NUMA multiprocessors)

A Virtual NUMA Multiprocessor Is Often Desired For
Extant OS And Application Layers

□ Presently, it is most frequently desired that multinode
systems have minimum impact on the extant

* OS

* as well as applications

□ This implies that one or more levels of computation
between the hardware and the applications must

* be highly decentralized

* and provide the desired degree of virtualization
(e.g., a NUMA multiprocessor)


□ Some of this virtualization may be provided by intermediate
levels such as

* a distributed execution environment (e.g., DCE)

* an object-oriented execution environment (e.g.,
OMA-based)

□ But such intermediate levels typically

* do not have direct access to kernel and OS level resources

* only via conventional centralized OS services,
which limits their degree of

■ decentralization

■ timeliness


The Degree Of Decentralization Need Not Change
Monotonically By System Level

□ The degree to which computations are physically decentralized
need not change monotonically by system
level,

e.g., physical decentralization is commonly

* low at the OS level

to allow the use of extant node OS’s which were not
intended to perform trans-node management of
resources other than for networking

* high at an intermediate distributed execution environment
(e.g., DCE) level

to reduce the trans-node resource management
obligations of the

■ application programs above

■ node OS’s below

* moderate at the application software level

■ to reduce the trans-node resource management
obligations (thus costs) of those programs

■ while retaining ability to sufficiently manage
and exploit the system’s structure

There Are Needs To Bypass Virtualization
In Multinode Systems

□ The programmers at a physically centralized level occasionally
need to

* bypass some virtualization

* and perhaps also employ some decentralized computation

e.g.,

* when they desire to see or control some software/
hardware binding for

■ performance, due to node locality of execution
and data access

■ fault tolerance, by partitioning and replication

* in the case of certain service outages where

■ application-specific recourse must be taken

■ or the end-to-end argument applies


Logical Decentralization Relates To The Form Of
Multilateral Activity

□ We regard a computation’s logical decentralization to
be the degree to which it is performed multilaterally,
determined by

* consentaneity: the extent to which the participating
entities must contribute to the computation
before it is complete

* equipollence: the functional parity of the participating
entities

* the number of participating entities

□ A quintessential form of utmost logical decentralization
is negotiated consensus among autonomous peers

□ Intermediate forms of logical decentralization are exemplified
by

* succession: where all activities of a computation
are performed for a period of time by one entity,
and then by another, in some serial sequence

* partitioning: where each entity performs a different
activity of the computation, whether consecutively
or concurrently

Physical and Logical Decentralization Tend To Interact

□ High degrees of physical decentralization at some level

imply that significant logical decentralization of at least
some resource management at that level is valuable or
essential

□ High degrees of logical decentralization at some level,
such as the application, in a multinode context

imply that high degrees of physical decentralization are
present at that or lower levels

□ High degrees of both logical and physical decentralization
can easily have extremely complex dynamics
which result in chaotic behavior

* avoiding chaos while maintaining high performance
and adaptivity in such systems having
many degrees of freedom requires sophisticated
control techniques which are as yet nascent

* the very strong coupling sometimes employed to
construct highly predictable realtime computer
systems for low-level applications (e.g., MARS)

■ that are both logically and physically decentralized
to significant degrees

■ at the expense of adaptability

is sufficient but not necessary for the avoidance of
chaotic behavior

Decentralized Realtime Applications Considered Here
Are For Mission Management

□ Some (including the earliest) modestly decentralized
realtime applications are

* low-level, synchronous, sampled data communication,
monitoring, and processing

*  subsystems 

e.g.,  process  control,  sonar  signal  processing 

□  But  the  decentralized  realtime  applications  of  interest 
here  are  those  strategic  ones  now  emerging  for  the 
purpose  of  managing  the  entire  system's  mission — 

e.g.,  coordination  of  multiple  entities  which  are 

*  manufacturing  a  vehicle 

*  repairing  a  damaged  reactor 

*  controlling  air  or  rail  traffic 

*  conducting  a  combat  engagement 

□  Decentralized  mission  management  applications 

*  are  in  addition  to 

*  employ  and  control 

the constituent lower-level (centralized and decentralized)
realtime subsystems

Decentralized Realtime Mission Management Systems
Generally Are Subject To Extraordinary Uncertainty

□ Realtime mission management that is highly physically
and logically decentralized is distinctive in the
extent to which it is subject to extraordinary execution-time
uncertainties at the application levels

□ The computations inevitably

* are asynchronous (mutually, globally), e.g., event
driven, aperiodic

* have dynamic dependencies, e.g., resource conflicts,
precedence constraints

* are co-evolving: each computation’s behavior depends
on that of others

* often constitute an overload

* permit little if any downtime for repairs or reconfiguration

□ Computing system physical distribution per se also
generally introduces considerable additional uncertainties,

e.g., variable, unknown communication latencies


Decentralized Realtime Mission Management Systems
Require High Dependability

□ The degree of mission success is determined by the extent
to which the system can be depended upon to provide
sufficient

* timeliness

* survivability

* safety and security

□ The dependability of lower layer subsystems (the
goal of traditional realtime computing) may be

* either necessary for mission-critical functions

(e.g., digital avionics flight control keeping the aircraft
aloft)

* or part of the uncertainty to be tolerated at the
system and mission layers

(e.g., communications, weapons in various states
of usability)

but it is not sufficient

(e.g., a flying aircraft which cannot perform its mission
is wasting resources and creating risks)

Resolving Realtime Dependability And Uncertainty
Is Often Beyond The Capability Of System Operators


□ In decentralized mission management systems, a major
challenge is simultaneously

* achieving sufficient dependability

* accommodating execution-time uncertainties

□ Virtually all realtime reconciliation of uncertainty
and dependability at the system and mission levels
has historically been based solely on the talent and expertise
of the system’s human operators, e.g.,

* in the control rooms of factories and plants

* in the cockpits of aircraft

□ Increasingly, the

* complexity and pace of the systems’ missions

* the number, complexity, and distribution of their
resources

cause cognitive overload,

which requires that these operators receive more support
in this respect from the computing system itself

□ Such support is beginning to appear at the application
levels in a variety of non-realtime computing systems,

but realtime constraints require it at the system software
levels as well
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Best-Effort  Resource  Management 
Involves  Trade-Offs  Of  Risk  And  Situational  Coverage 


□  Best-effort  realtime  resource  management  involves 
trade-offs  of  risk  and  situational  coverage 

* best-effort on-line realtime scheduling heuristics
currently offer

■ empirically-based high confidence that acceptable
computational timeliness will be achieved
over a broad range of realistic conditions

■ but no, or low, formal bounds on guaranteed
best case timeliness

(as is necessarily the case for human operators)

* traditional off-line “hard” realtime scheduling algorithms
provide

■ formal guarantees of optimum computational
timing under extremely restricted conditions

■ but behavior which is unknown, or known to be
pathologically wrong, outside those conditions

□ Examples of realtime applications which seem to call
naturally for each of these extremes come immediately
to mind,

but  beware  of  the  human  trait  to  miscalculate  risks 

*  people  erroneously  undervalue  the  reduction  of 
risk 

*  in  comparison  to  the  elimination  of  risk 


Decentralized  Realtime  Mission  Management  Requires 
Best-Effort  Realtime  Resource  Management 


□  In  decentralized  mission  management  systems,  both 
the  application  and  computer  system  software  (e.g., 
OS, DCE) must make an on-line best effort to

* accommodate dynamic and non-deterministic
■ external (application environment)
■ internal (system resource)
conditions

* in a robust, adaptable way so as to undertake that

■ as many as possible

■ of the most important computations

■ are as acceptable, in the time and other domains,
to the application as possible

□ Best-effort resource management is generally heuristic

* the use of heuristics for non-realtime computing is

■ common in applications (most conspicuously in
artificial intelligence, pattern recognition)

■ less familiar in system software

* but heuristics have been foreign to realtime resource
management (focused on static determinism)
until Jensen's Archons project and Alpha
kernel for mission management


Decentralized  Realtime  Mission  Management 
Calls  For  A  New  Paradigm 


□ Conventional realtime resource management attitudes
and technology do not permit such application-specific
trade-offs between

*  situational  coverage 

*  optimality  and  predictability 

□  A  new,  more 

*  general 

*  scaleable 

realtime computing paradigm is needed to better accommodate
asynchronous decentralized realtime computing
systems

□ Paradigm shifts are rather uncommon in computing,

e.g.,

*  parallel  processing 

*  data  flow  models 

and virtually unprecedented in realtime computing

An Analogy Can Be Drawn From Gravity

□ Prior to Newton’s law, people felt that they understood
gravity, as evidenced by the fact that they could
take it into account in building acceptably stable mechanical
constructions

□ But some scientists were dissatisfied with this understanding
of gravity when it was applied to larger scale,
more complex, more distributed contexts such as astronomy

□ Newton’s clarification and formalization of the ‘force’
of gravity overcame essentially all of these dissatisfactions

□ But by Einstein’s time, hardware technology (such as
instrumentation range and precision) had raised a
new set of incongruities between what was then understood
as gravity and observable reality

□ The understanding of gravity had to be generalized
and elaborated by the theory of relativity as ‘space-time
curvature’ in order to be better applicable in larger,
more complex, more distributed contexts

□ (Of course, now we know this remains a continuing
process; cf. ‘gravity waves’)

Nature Provides Other Examples Of Paradigm Shifts
To Accommodate Larger Scale

□ Other examples of paradigm shifts for scaling up are
readily found in nature,

where higher animals are more complex because they
are larger, rather than vice versa, e.g.,

* the principles of cell biology inherently limit the
scale of single cell organisms

* the physiology of insects inherently limits their
scale

□ Examples are also manifest in engineering, e.g.,

* a small stream can be bridged by placing a log
across it

* streams only a few times wider than the longest
feasible single log can be bridged by joining a
small number of logs end-to-end

* even wider bodies of water require bridges based
on entirely different principles, such as suspension
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Asynchronous  Decentralization 
Impacts  The  Nature  Of  Realtime  Resource  Management 

□  Asynchronous  Decentralized  Realtime  Computing 

□ A New Paradigm For Scaleable Realtime Computing
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We  Propose  A  New  Model  Of  Realtime  Computing 

□ Because the traditional realtime viewpoint and its terminology
are imprecise, oversimplified, and unrealistic,

they can, and do, limit

* the kinds of realtime systems that can be built

* the cost-effectiveness of those that are built


□ Asynchronous decentralized realtime computer systems
for mission management are a conspicuous instance
of suffering from both these limitations

□ We argue that our new paradigm of realtime computing
offers a more systematic, comprehensive, and realistic
framework which can help reduce such limitations;

it is based on


* a new, more general method for expressing time
constraints and scheduling objectives:

the Benefit Accrual Model

* new realtime scheduling objectives and policies
which accommodate the requirement for robust
adaptivity to dynamic system and application conditions:

Best-Effort policies

Our New Realtime Computing Paradigm Is Based On
The Benefit Accrual Model And Best-Effort Scheduling


□ Best-Effort Scheduling


A Realtime Computation Has A Time Constraint


□ We define a realtime computation to be a segment of a
computational entity (such as a thread, task, or process)
subject to a time constraint

□ A time constraint is the relationship between

* when a realtime computation completes execution
* the temporal merit of that computation

e.g., in the classical deadline case

* completing before the deadline time is better
* completing after the deadline time is worse

□ A time constraint is manifest in the computation program
as a demarcated region of code whose execution
completion time is subject to the time constraint,

e.g., the computation must complete execution of the
region before the deadline time arrives

BEGIN TC (DL = 30 ms)

Realtime Computation

END TC

otherwise it must suffer an exception condition
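
The demarcated region described above can be imitated in an ordinary language. The following is only a minimal sketch in Python, not the notation or mechanism of any particular kernel: the names `time_constraint`, `TimeConstraintViolation`, and `run_region` are hypothetical, and the deadline is checked only when the region finishes, so a region that overruns is not preempted.

```python
import time
from contextlib import contextmanager


class TimeConstraintViolation(Exception):
    """Raised when a demarcated region completes after its deadline."""


@contextmanager
def time_constraint(deadline_s):
    # Hypothetical stand-in for the slide's BEGIN TC / END TC markers.
    start = time.monotonic()
    yield
    elapsed = time.monotonic() - start
    # The check happens only when the region ends; nothing here preempts
    # a region that is still running when its deadline arrives.
    if elapsed > deadline_s:
        raise TimeConstraintViolation(
            f"region took {elapsed:.3f} s, deadline was {deadline_s:.3f} s"
        )


def run_region(work_s, deadline_s):
    """Return True if the region met its deadline, False if it raised."""
    try:
        with time_constraint(deadline_s):
            time.sleep(work_s)  # placeholder for the realtime computation
    except TimeConstraintViolation:
        return False
    return True
```

A call such as `run_region(0.0, 0.5)` stays inside its constraint, while `run_region(0.05, 0.01)` takes the exception path.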


Timeliness Is The Basis For Realtime Scheduling

□ We consider the timeliness (i.e., temporal merit) of
computations to be the principal basis for

* specifying
* scheduling
* evaluating

computation completion times

□ In the Benefit Accrual Model, timeliness is defined
with a framework consisting of three relationships
(e.g., functions)


A Timeliness Framework Is Comprised Of Three Parts

□ Each realtime computation has a time constraint,
i.e., a relationship between

* when the computation completes execution

* the resulting temporal merit (timeliness) of that
computation

(e.g., for the classical deadline time constraint,
lateness = completion time - deadline)

□ A collective temporal merit relationship defines

* the collective timeliness of a set of computations

* in terms of the individual timeliness of all its constituent
computations

(e.g., the number of deadlines met, i.e., with negative
lateness)

□ A collective temporal acceptability relationship defines

* the acceptability, in an application-specific metric,

* of the completion times (predicted or experienced)
for a set of computations

expressed in terms of their individual or collective
timeliness

* for specified system and application states

(e.g., acceptable means always meeting all deadlines)
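
The three relationships can be read as three functions layered on one another. The sketch below uses the parenthetical examples from this slide (lateness, number of deadlines met, all deadlines met). The function names and the sample job data are illustrative assumptions, as is the choice to count zero lateness as a met deadline.

```python
def lateness(completion_time, deadline):
    # Individual timeliness for the classical deadline time constraint.
    return completion_time - deadline


def deadlines_met(latenesses):
    # Collective timeliness: number of computations that are not tardy.
    # Assumption: finishing exactly at the deadline (zero lateness) counts as met.
    return sum(1 for x in latenesses if x <= 0)


def acceptable(latenesses):
    # Collective temporal acceptability: every deadline must be met.
    return deadlines_met(latenesses) == len(latenesses)


# Hypothetical (completion time, deadline) pairs, in milliseconds.
jobs = [(20, 30), (45, 40), (30, 30)]
values = [lateness(c, d) for c, d in jobs]
```

Swapping in a different function at any one layer changes the scheduling objective without touching the other two.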


Timeliness For Classical Deadline Time Constraints
Is In Terms Of Tardiness

□ The classical deadline time constraint (i.e., in scheduling
theory) employs

* lateness = completion time - deadline
* or tardiness = positive lateness

as its individual measure of timeliness

[Figure: time line showing release time, run time, and deadline; lateness is negative before the deadline and positive (tardiness) after it]

□ The collective timeliness relationship of a set of computations
having classical deadline time constraints is
most frequently chosen to be one of the following

* the occurrence or not of at least one tardy (positive
lateness) completion

* the number of tardy completions

* the mean lateness

□ Classical deadline-based scheduling theory often implicitly
presumes that

collective temporal acceptability is equivalent to collective
timeliness
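
As a small worked example, the three collective measures listed on this slide can be computed from a list of lateness values. This is a sketch with hypothetical names (`tardiness`, `collective_measures`) and made-up sample values.

```python
from statistics import mean


def tardiness(lateness):
    # Tardiness is the positive part of lateness.
    return max(0, lateness)


def collective_measures(latenesses):
    tardy = [x for x in latenesses if tardiness(x) > 0]
    return {
        "any_tardy": bool(tardy),          # at least one tardy completion?
        "tardy_count": len(tardy),         # number of tardy completions
        "mean_lateness": mean(latenesses)  # mean lateness (may be negative)
    }


# Hypothetical lateness values, in milliseconds.
sample = collective_measures([-10, 5, 0, 9])
```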
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The  Traditional  Hard  Deadline  Case  Allows  Only  For 
Binary  Timeliness  And  Acceptability 

□ The traditional realtime computing interpretation of
‘hard’ deadlines implies restrictions of timeliness to

* a binary special case of the deadline time constraint:
timely and untimely

[Figure: time line showing release time, run time, and deadline, with the timely and untimely regions]

* a binary collective timeliness relationship

■ untimely: the occurrence of at least one tardy
completion

■ timely: otherwise

* a binary measure of collective temporal acceptability

■ acceptable: no occurrence of tardy completions
(unanimous optimum) under any conditions

■ unacceptable: the occurrence of at least one tardy
completion under any conditions

where the semantics of “unacceptable” are specific
to the computation and application, e.g.,

■ non-productive
■ counter-productive

in some way
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Often  Time  Constraints  Are  Not  Binary 

□ Often it is very useful or necessary to have softer (i.e.,
non-binary) time constraints

□ A common example of such a softer time constraint:

if a particular computation cannot be completed at a
time of optimal merit (i.e., before its “predeadline”)

* completing it a little “tardy” has reduced merit,
but is better than not completing it at all

* however, completing it actually tardy (after its
deadline) has negative merit, i.e., is worse than
not completing it at all

[Figure: time line showing release time, “predeadline”, and deadline, with regions of optimal, suboptimal, and negative merit]

□ Some softer time constraints are routinely handled in
terms of lateness with scheduling theory,

but the linearity of lateness greatly limits the interpretation
of merit (e.g., excludes this example)

□ Realtime computing practice tends to express and
handle softer time constraints even less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways
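
The “predeadline” example on this slide amounts to a three-level merit function of completion time. The sketch below is illustrative only. The merit values are arbitrary assumptions, with 0 taken to stand for "not completing at all", and completion exactly at a boundary is assigned to the earlier region.

```python
def soft_merit(completion_time, predeadline, deadline,
               optimal=1.0, reduced=0.5, penalty=-1.0):
    # Merit levels are hypothetical; "not completing at all" is taken as merit 0.
    if completion_time <= predeadline:
        return optimal   # completed at a time of optimal merit
    if completion_time <= deadline:
        return reduced   # a little "tardy": reduced merit, still better than nothing
    return penalty       # actually tardy: worse than not completing at all
```

No single linear lateness function reproduces these three levels, which is the limitation the slide points out.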


Often  Collective  Timeliness  Is  Not  Binary 

□ Softer time constraints necessitate correspondingly
“softer” (i.e., non-binary) collective timeliness relationships

□ Using the previous time constraint example,

the collective timeliness relationship could be one
which (as a scheduling criterion) increases the number
of completions in the optimal region, e.g.,

* the sum (or mean) of

* weighted lateness = (completion time - deadline)
+ k (completion time - predeadline)

□ Some softer collective timeliness relationships are
routinely handled in terms of lateness with classical
scheduling theory

while others necessitate more expressive time constraint
relationships

□ Realtime computing practice tends to express and
handle softer collective timeliness less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways
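
To make the weighted lateness criterion concrete, the sketch below takes the formula as (completion time - deadline) + k (completion time - predeadline) and sums it over a set of computations. The function names, the sample jobs, and the value of k are assumptions for illustration, and a scheduler would presumably try to minimize the total.

```python
def weighted_lateness(completion_time, predeadline, deadline, k):
    # Lateness relative to the deadline, plus k times lateness relative
    # to the predeadline; k weights how strongly the optimal region is favored.
    return (completion_time - deadline) + k * (completion_time - predeadline)


def total_weighted_lateness(jobs, k):
    # jobs: iterable of (completion_time, predeadline, deadline) tuples
    return sum(weighted_lateness(c, p, d, k) for c, p, d in jobs)


# Hypothetical schedule: one job done before its predeadline, one after it.
jobs = [(10, 20, 30), (25, 20, 30)]
total = total_weighted_lateness(jobs, k=2)
```

With these numbers the first job contributes -40 and the second contributes 5, so the job that missed its predeadline is the one that raises the total.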


Often Temporal Acceptability Is Not Binary


□ Softer collective timeliness necessitates correspondingly
“softer” (i.e., non-binary) collective temporal
acceptability relationships

□ The degree of collective temporal acceptability might
be based on

* collective timeliness alone, e.g., acceptable

■ only above one lower bound under certain circumstances,
and above a different lower bound
under other circumstances

■ to the degree that it exceeds a lower bound

* both individual and collective timeliness, e.g., acceptable
to the degree that

■ some

• total number of
• or specific individual

computations

■ are late by a certain amount
■ under certain conditions

□ Realtime computing practice tends to express and
handle softer temporal acceptability less effectively:

not on a time constraint basis at all, but instead in disparate,
ad hoc, imprecise ways


The Traditional Realtime Computing Interpretation
Of A Deadline Is A Downward Step Function

□ The traditional realtime computing interpretation of a
deadline, when viewed as a time constraint function,
is

[Figure: two plots, each with the axis Computation Completion Time]

* a binary-valued, downward step function

■ completing the computation anytime between
its release (X = 0) and deadline times is uniformly
timely

■ and otherwise is uniformly untimely

* the smaller of the two binary merit values may be

■ 0: zero merit is attained for completing the computation
after its deadline

■ negative: a large merit penalty is incurred for completing
the computation after its deadline
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A  Computation  Time  Constraint  Relationship  Is 
Temporal  Merit  As  A  Function  Of  Its  Completion  Time 


□ In the Benefit Accrual Model, a computation’s time
constraint relationship (i.e., urgency) is made arbitrary
by thinking explicitly of

individual temporal merit being any function f of the
computation’s completion time t

[Figure: temporal merit plotted against computation completion time t]

□ The classical deadline function’s merit of lateness is
then depicted as

[Figure: lateness plotted against computation completion time]

* a line

* with slope +1

* having a range of [-deadline, +∞)

* crossing the X axis at the deadline time (becoming
tardiness)


In  Real  Systems  Very  Often  The  Time  Constraint 
Is  Neither  Linear  Nor  Binary 


□  Both  the  classical  and  traditional  realtime  computing 
interpretations  of  a  deadline  are  often  poor  approxi¬ 
mations  to  actual  realtime  constraints 

□  There  are  many  cases  in  realtime  applications  where 

* there is some diminished merit attained for completing
the computation within an allowable tardiness
period

* the merit is not constant prior to the “deadline”

* the penalty is not constant after the “deadline”

* the merit measure and range are application-specific


[Figure: four example plots of merit versus computation completion time]


□ Deadlines are not a general mechanism for expressing
scalable realtime time constraints
In The Benefit Accrual Model
A Time Constraint Is Expressed By A Benefit Function


A Benefit Function Is Defined Over A Range Of Time


□  The  Benefit  Accrual  Model  expresses  an  individual 
computation's  time  constraint  relationship  in  terms  of 
a  temporal  merit  called  benefit  (B) 


□  Benefit  functions  may  be  unimodal  or  multimodal 


(the non-linear optimizations involved in dealing with
multimodal benefit functions lead us to temporarily
confining ourselves here to unimodal ones)


□  The  benefit  metric  is  application-specific  and  defined 
system-wide 

□ Benefit functions are

* derived by the programmers directly from the requirements
and behavior of the realtime computation
(usually an application activity)

* subject to a system-wide engineering process (just
as are assignments of classical priorities)


Sooner  And  Later  Times  Define  The  “Best”  Interval 


□ The later time is that after which the benefit function
value is (monotonically) non-increasing

*  thus,  completing  the  realtime  computation  at  or 
after  this  time  is  better 

*  a  benefit  function  always  has  a 

□ The sooner time t_s is that after which the benefit function
value is (monotonically) decreasing

* thus, completing the realtime computation at or
before this time is better

* a benefit function need not have a t_s < t_T


□ If its value becomes zero or negative at time t_e ≥ t_s, a
benefit function has an expiration time


□ The time axis is the one the scheduler uses; it may be

* physical

■ absolute (‘calendar/wall clock’) time
■ relative to (since) some past event

* logical: a number which monotonically increases,
but not necessarily at regular intervals

□ The origin of the benefit function axes is the current
time t_c (value of the system clock)

□ The earliest time for which a benefit function is defined
is called its initial time t_i;

the latest time for which a benefit function is defined
is called its terminal time t_T


(some  systems  and  scheduling  algorithms  call  for  the 
specification  of  an  indefinite  terminal  time) 


□ A benefit function is evaluated only for values of its
time parameter between the current time and its terminal
time
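
The time window described above can be captured in a small data structure. This is a sketch under stated assumptions: the class name, field names, and the choice to return `None` outside the window are illustrative, and an indefinite terminal time is modelled as infinity.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BenefitFunction:
    """A benefit function defined from its initial time to its terminal time."""
    shape: Callable[[float], float]   # benefit as a function of completion time
    initial_time: float               # earliest time the function is defined
    terminal_time: float = math.inf   # latest time; inf models "indefinite"

    def evaluate(self, t: float, current_time: float) -> Optional[float]:
        """Return the benefit at time t, or None if t is outside the window.

        The function is only evaluated between the current time (or the
        initial time, if that is later) and the terminal time.
        """
        earliest = max(self.initial_time, current_time)
        if t < earliest or t > self.terminal_time:
            return None
        return self.shape(t)
```

A scheduler built this way would never ask about completion times that are already in the past or beyond the terminal time.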


Deadlines  Are  Due  Times  Subject  To  A  Specific 
Collective  Temporal  Acceptability  Criterion 


□ A special case of a sooner time t_s is a due time t_d,
distinguished by the benefit function's first derivative
having an infinite discontinuity at t_s = t_d

□ A deadline is a due time subject to a collective temporal
acceptability criterion which does not allow the
due time to be missed

□ A benefit function is defined as hard if it has

* a zero or constant negative value before t_l

* an infinite discontinuity in its first derivative at t_l
if t_l > t_i

* a due time t_d

* a constant value between t_l and t_d

* a constant value between t_d and t_T
A Classical “Hard Deadline” Is A Special Case


□ The most common meaning of a classical “hard deadline”,

a computation which completes anytime between its
initial and deadline times is uniformly acceptable, and
otherwise is unacceptably tardy,

corresponds in this model to

* a hard benefit function with

* deadline t_d ≤ t_T

* unit binary range {0, 1}


□ Classical definitions of “hard deadline” vary a little

* they generally do not provide for a t_l > t_i

* sometimes the range of this function is {−∞, 1}; a
few algorithms define the range as {0, ke},
where e is the computation's execution time and k
is a proportionality factor


A Released Time Constraint May Be Effective
Either Immediately Or In The Future


□ A time constraint (and thus benefit function) is
made known to the scheduler at its release time
(which is usually a scheduling event)

□ When the benefit function is released, its initial time
may be

* the current time: the time constraint is released
at the time it is to take effect (i.e., at t_i = t_c)

* a future time: the time constraint is released in
advance (i.e., t_i > t_c) to improve scheduling

(but t_i ≤ t_c is a necessary condition for the computation
to complete, if not also begin, execution)
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All  Benefit  Functions  Which  Are  Not  Hard  Are  Soft 

□  All  benefit  functions  which  are  not  hard  are  soft 


□ Soft benefit functions can have arbitrary values before
and after the optimal value at t_s


□ Soft benefit functions need not have

* constant values on each side of t_l and t_d

*  expiration  times 


Realtime  Computations  Generally  Have 
Dynamic  Dependencies 


□  Expressing  or  releasing  a  benefit  function  relative  to  a 
future  time/event,  such  as 

*  the  completion  of  some  other  computation 

*  an  external  signal 

is  adding  a  (generally  dynamic)  dependency  to  the 
time  constraint 

□ Dynamic dependencies can require a realtime computation
to be completed at a time yielding zero or negative
benefit, when a computation

* has been initiated and cannot be
■ stopped (preempted or aborted)
■ undone

(such  as  one  related  to  a  physical  activity  in  the 
application  environment) 

*  would  block  another  if  not  completed,  despite  its 
consequential  zero  or  negative  benefit 

(which  can  require  indefinite  function  terminal  times) 

□  Dependencies  must  be  accommodated  in  conjunction 
with  time  constraints  according  to  some  specific 
scheduling  policy, 

and  thus  are  not  part  of  the  benefit  accrual  model  per 
se 
Computations Also Have Relative Importances


Urgency  And  Importance  Are  Used  Together 


□ Each computation generally also has a relative importance
(i.e., functional criticality) with respect to other
computations contending for completion

□  Importance  is  orthogonal  to  urgency 

*  a  computation  with  high  urgency  (e.g.,  a  near 
deadline)  may  not  be  highly  important 

*  a  computation  with  low  urgency  (e.g.,  a  far  dead¬ 
line)  may  be  very  important 

□ Importance may be a function of time and other parameters
that reflect the application and computing
system state,

and can be represented and employed similarly to urgency



□ In simple cases importance may be a constant, and
benefit may be simply urgency scaled by importance

* urgency and importance might be combined prior
to execution time

* a computation's completion might be expedited by
elevating its benefit for the remaining execution
time


□ In more general cases where importance needs to be a
variable, f_t and f_i must be evaluated together dynamically
to determine the benefit,

e.g., as some function g(f_t, f_i) of the f_t and f_i functions
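
Both ways of combining urgency and importance can be sketched as function composition. The helper names below are hypothetical, and the choice of multiplication as the combining function in the usage note is only one possible g.

```python
from typing import Callable

TimeFunction = Callable[[float], float]


def scaled_benefit(urgency: TimeFunction, importance: float) -> TimeFunction:
    """Simple case: constant importance scales the urgency function."""
    return lambda t: importance * urgency(t)


def combined_benefit(
    urgency: TimeFunction,
    importance: TimeFunction,
    g: Callable[[float, float], float],
) -> TimeFunction:
    """General case: urgency and importance are both evaluated at time t
    and combined by an application-chosen function g."""
    return lambda t: g(urgency(t), importance(t))
```

For example, `combined_benefit(u, i, lambda a, b: a * b)` reduces to the scaled case when `i` is constant.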
Realtime Schedulers Are Usually Presumed To Know
Something About Computation Execution Durations

The Benefit Accrual Model Is Based On
Benefit Functions And Benefit Accrual Functions


□ A realtime computation has an execution duration e
which the scheduler

* either knows prior to execution (important for realtime
scheduling)

■ deterministically (the most common presumption)

■ estimated

• stochastically (i.e., in expectation)

• non-stochastically, e.g.,

- bounds

- rules

* or does not know prior to execution (limits predictability
of realtime scheduling)

□  This  duration  may  or  may  not  take  into  account  a  fore¬ 
cast  of  dynamic  dependencies 

□ Non-deterministic durations may be estimated dynamically
(during the computation's execution), e.g.,

*  conditional  probability  distributions 

*  execution-time  knowledge-driven  rules 


□  Benefit  Functions 

□  Benefit  Accrual  Functions 
The Scheduler Assigns Execution Completion Times

□ A benefit accrual model scheduler considers all released
time constraints between the current time and
its horizon tw, the future-most terminal time


□ It assigns the estimated execution completion times,
and consequently the

*  initiation  times 

*  ordering 

for  those  computations 
using  an  algorithm  which 

* seeks to sufficiently satisfy the scheduling (collective
temporal acceptability) criterion

* taking into account dependencies and importances

(such as earliest-deadline-first for the classical “hard
realtime” criterion of all computations meeting their
deadlines)
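
For the earliest-deadline-first case mentioned above, assigning completion times also fixes initiation times and ordering. The sketch below assumes a single processor, non-preemptive execution, known execution times, and jobs that are all already released; the function name and tuple layout are illustrative.

```python
from typing import Dict, List, Tuple


def edf_completion_times(
    jobs: Dict[str, Tuple[float, float]], current_time: float = 0.0
) -> List[Tuple[str, float, float, bool]]:
    """Assign start and completion times in earliest-deadline-first order.

    jobs maps a job name to (execution_time, deadline).
    Returns (name, start, completion, deadline_met) in execution order.
    """
    t = current_time
    schedule = []
    for name, (exec_time, deadline) in sorted(jobs.items(), key=lambda kv: kv[1][1]):
        start = t
        t += exec_time                      # job runs to completion
        schedule.append((name, start, t, t <= deadline))
    return schedule
```

A scheduler for a different collective criterion would replace the sort key and the acceptance test, but the shape of the output stays the same.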


Collective  Optimality  Is  Not  Always  Defined  In  Terms  Of 
Unanimous  Individual  Optimums 

□ For the special case of any collective temporal acceptability
criterion defined to be a

* unanimous
* optimum

of the individual temporal acceptabilities,

there is an equivalent criterion defined in terms of individual,
rather than collective, optimums, e.g.,

* meet all deadlines: meet each deadline

* maximize all benefits: maximize each benefit

□ In general, collective temporal acceptability is not defined
as necessarily unanimous or optimum with respect
to the individual computations' temporal acceptability,
e.g., maximize

* the number of deadlines met

* the sum of the benefits

* the number of computations during a time frame T
which achieve at least P percent of their maximum
possible benefit

* the probability that at least P percent of the computations
during a time frame T will achieve their
maximum benefits




In  The  Benefit  Accrual  Model 
Collective  Temporal  Acceptability  Is  Based  On 
Accruing  Benefit  From  Individual  Computations 


□ In the benefit accrual model, collective temporal acceptability
criteria are based on

* accruing benefit from the individual computations
in a set

* in a manner specified by a benefit accrual function
for that set


□ This is general enough to encompass a wide range of
collective temporal acceptability criteria

* the unanimous individual optimum cases such as
traditional “hard realtime,”

for which the accrual predicate is the product of
the individual benefits (assuming the usual range
of {0, 1})

* cases not defined as necessarily unanimous or optimum
with respect to the individual computations'
temporal acceptability,

which we term best-effort scheduling
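
A few benefit accrual functions can be written down directly. This is an illustrative sketch: the function names are hypothetical, and the benefits are assumed to be already-evaluated numbers for one set of computations.

```python
from math import prod
from typing import Sequence


def hard_realtime_accrual(benefits: Sequence[float]) -> float:
    """Product of binary benefits: 1 only if every computation earned 1."""
    return prod(benefits)


def summed_accrual(benefits: Sequence[float]) -> float:
    """A best-effort criterion: total benefit accrued by the set."""
    return sum(benefits)


def count_reaching_percent(
    benefits: Sequence[float], max_benefits: Sequence[float], percent: float
) -> int:
    """Number of computations achieving at least `percent` of their maximum."""
    return sum(
        1 for b, m in zip(benefits, max_benefits) if b >= m * percent / 100.0
    )
```

With binary benefits, a single missed deadline drives the product to zero, while the sum only drops by one; that difference is what separates the unanimous criterion from a best-effort one.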
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Our  New  Realtime  Computing  Paradigm  Is  Based  On 
The  Benefit  Accrual  Model  And  Best-Effort  Scheduling 

□  The  Benefit  Accrual  Model 

□ Best-Effort Scheduling
Conventional Realtime Scheduling Focuses On
Unanimous  Optimum  As  The  Criterion 


Realtime  Computing  Systems  Generally  Have  A  Wide 
Spectrum  Of  Mission-Critical  Timeliness  Needs 


□ Scheduling principles and practices which are realtime
by our definition (i.e., based on satisfying completion
time constraints) have until recently been focused
exclusively on

* guaranteeing that

* a unanimous optimum
scheduling criterion will be met

(e.g., the classical “hard realtime” case of guaranteeing
that all deadlines are always met)

□ Even though the traditional “hard realtime” cases are
intended (and commonly imagined) to achieve this
ideal

* physical laws (especially in decentralized systems)

* or the intrinsic nature of the applications (especially
at mission management levels)

generally make it

* either non-cost-effective

* or impossible

(there are only a few exceptions)


□ In general, realtime systems need

* a sufficient number of computation completion
times to be

* sufficiently likely

* to be sufficiently acceptable (perhaps optimal)

* given the current application and computer system
circumstances

* (perhaps over a wide range of such circumstances)

where each instance of “sufficient” is application-specific

□ The Benefit Accrual Model provides a framework for
expressing

* “softer”

■ time constraints, in the sense of non-binary
completion time acceptability

■ scheduling criteria, in the sense of non-unanimous
and non-optimum

* in addition to (and in the same manner as) the
conventional “hard” time constraints and scheduling
criteria

□ These softer needs are realized with best-effort scheduling
algorithms
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Best-Effort  Scheduling  Seeks  To  Do  The  Best 
That  Is  Possible  Under  The  Current  Conditions 

□ “Best-effort” (BE) realtime scheduling algorithms seek
to provide the “best” (as specified by the application)
computational timeliness they can,

given the current application and computer resource
conditions

□ This concept,

and the Time-Value Function progenitor of the Benefit
Accrual Model as a framework for expressing time
constraints,

were originated by Jensen in 1977 and published in 1985

□ The first generation of BE on-line (at execution
time) scheduling algorithms emerged from Jensen's
Ph.D. students in his Archons Project at CMU, for the
Alpha asynchronous decentralized realtime OS kernel

* Locke's algorithm (1986)

* Clark's algorithm (1990)

□ A second generation of on-line BE algorithms is being
devised as part of a recent multi-university effort to
establish formal performance bounds for on-line algorithms
in general and certain BE ones in particular

□ A first generation of off-line BE algorithms is being devised
in France




Locke  Did  The  First  Best-Effort  Scheduling  Algorithm 

□ The most salient characteristics of Locke's algorithm

♦ allows a wide variety (but not all forms) of Time-Value
Functions (TVFs)

♦ intends that importance be reflected by scaling the
TVF values

♦ execution times are defined stochastically

♦ when underloaded, schedules Earliest-Deadline-First
(EDF) to meet all deadlines, which in general
does not accrue maximum value

♦ if a job arrival, or execution time overrun, results
in a sufficiently high probability of overload,

jobs are set aside in order of minimum expected
value density (expected value/expected remaining
execution time) until the probable overload is removed

♦ the scheduling optimality criterion when overloaded
is the special (but reasonable) case of maximizing
the sum of the job values attained

♦  does  not  deal  with  dependencies  (e.g.,  precedence, 
resource  conflicts) 

□ Locke used simulations to demonstrate that his algorithm
performed well in comparison to others, such as
EDF, for a number of interesting overload cases;

but provided no formal performance characterizations
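
The load-shedding idea (set aside jobs of minimum value density until the overload is removed) can be illustrated with a deliberately simplified sketch. It is not Locke's algorithm: execution times and values here are fixed numbers rather than stochastic quantities, overload is detected by a plain EDF feasibility check on one processor, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Job = Dict[str, float]  # expects keys "value", "remaining", "deadline"


def edf_feasible(jobs: List[Job], now: float = 0.0) -> bool:
    """True if running the jobs back to back in deadline order meets every deadline."""
    t = now
    for job in sorted(jobs, key=lambda j: j["deadline"]):
        t += job["remaining"]
        if t > job["deadline"]:
            return False
    return True


def shed_until_feasible(jobs: List[Job], now: float = 0.0) -> Tuple[List[Job], List[Job]]:
    """Set aside minimum value-density jobs until the rest are EDF-feasible.

    Value density is value divided by remaining execution time.
    Returns (kept jobs in EDF order, shed jobs in shedding order).
    """
    kept = list(jobs)
    shed: List[Job] = []
    while kept and not edf_feasible(kept, now):
        victim = min(kept, key=lambda j: j["value"] / j["remaining"])
        kept.remove(victim)
        shed.append(victim)
    kept.sort(key=lambda j: j["deadline"])
    return kept, shed
```

When the job set is underloaded the loop never runs and the result is simply the EDF order, matching the underload behavior described on the slide.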
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Locke's  Algorithm  Has  Been  Used  Experimentally 


Clark's  Algorithm  Deals  With  Dependencies 


□ Versions of Locke's algorithm have been implemented
and experimentally verified to be superior and cost-effective

with respect to traditional realtime scheduling algorithms,
such as EDF and fixed priority,

for a number of interesting cases, including

* in the Alpha asynchronous decentralized realtime
OS kernel

■ a battle management application for air defense,
by General Dynamics and the Archons
Project at CMU, in 1987

■ a ball-and-paddle realtime scheduling evaluation
testbed by the Archons Project in 1987

which also added

• nested time constraints

• timeliness failure abort processing

* in the Mach 2.5 OS kernel

■ a synthesized realtime workload, by the Archons
Project in 1987


□ The most salient characteristics of Clark's algorithm

♦ permits only rectangular TVFs, whose value is the
job's importance

♦ execution times are both fixed and known

♦ the scheduling optimality criterion is the special
(but reasonable) case of maximizing the sum of the
job values attained

♦ deals with dependencies (e.g., precedence, resource
conflicts) which are not known in advance

♦ selects jobs to be scheduled in decreasing order of
value density (VD)

♦ selected jobs are scheduled EDF, which maximizes
summed value for the TVFs he permits

♦ when each job is scheduled, so are those on which
it depends

♦ if necessary, precedent jobs are aborted or their
deadlines are shortened (whichever is faster), to
satisfy the deadline of the dependent job

□ Clark's formal analysis and simulations showed that

♦ when overloaded, if the algorithm can apply all
available cycles to jobs that complete, no other algorithm
can accrue greater value given the current
knowledge

♦ since future jobs are unknown, there is no performance
guarantee
Recent Work Explores Competitive Factor Bounds

□ Researchers at UTexas, NYU, and UMass have recently
developed limited performance bounds for on-line realtime
scheduling

* competitive factor measures the value an algorithm
guarantees it will achieve compared to a
clairvoyant scheduler

* considers only rectangular TVFs, and execution
times which are (mostly) both fixed and known;

like Clark's algorithm, this means that scheduling
by EDF when underloaded not only meets all deadlines
but maximizes summed value

* if all values are proportional to execution time, an
on-line algorithm can guarantee a competitive factor
of no more than 1/4

* the performance bound is lower when

■ value is not proportional to execution time
■ the ratio of maximum to minimum VD increases
■ execution times are not fixed and known

* confirms that performance guarantees are impossible
if workload characteristics are unknown

* acceptable performance assurances may be possible
when limited, reasonable, workload information
is known


□ Their algorithms are devised primarily for the purpose
of illustrating the performance bound
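
The competitive factor compares the value an on-line algorithm obtains with the value a clairvoyant scheduler obtains on the same workload. The sketch below only estimates it from a finite set of measured workloads, so the result is an upper estimate of the true guaranteed factor, which is a worst case over all workloads. The function name and input layout are assumptions.

```python
from typing import Iterable, Tuple


def observed_competitive_factor(results: Iterable[Tuple[float, float]]) -> float:
    """Smallest ratio of on-line value to clairvoyant value over the given workloads.

    results holds (online_value, clairvoyant_value) pairs; workloads where the
    clairvoyant scheduler earns nothing are skipped.
    """
    ratios = [online / best for online, best in results if best > 0]
    if not ratios:
        raise ValueError("no workload with positive clairvoyant value")
    return min(ratios)
```

An algorithm that meets the 1/4 bound quoted above would never show an observed factor below 0.25 on workloads where value is proportional to execution time.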


Maynard  Is  Addressing  Overload  Behavior 
With  Best-Effort  Schedulers 

□ Maynard's thesis is

* improving the understanding of the overload behavior
of on-line realtime scheduling algorithms

* developing techniques for defining benefit functions
to yield desired overload behavior

□ Its scope includes best-effort schedulers that use benefit
density as the load shedding criterion

□ The work to date provides an algorithm for setting job
importance values to impose a strict priority ordering
among selected groups of jobs

□ This allows integration of results from off-line schedulability
analysis, to

* provide “guarantees” when necessary and possible

* retain adaptability of dynamic scheduling

□  His  simulations  support  the  validity  of  the  approach 

□ He is also creating tools which help the system designer

*  select  and  adapt  suitable  scheduling  algorithms 
for  specific  applications 

*  choose  appropriate  job  importance  values 
There Is Somewhat Related Work In Other Fields


□ The most closely related work to Best-Effort realtime
scheduling is Cost-Based Scheduling for queueing and
dropping network packets, done at Stanford

* a cost function specifies the cost per unit length of
queuing delay for a packet as a function of time

* packets have only non-decreasing cost functions

* instead of creating a schedule, the algorithm
queues the next packet which it estimates would
cost the most to delay

* cost is calculated using an estimation of future cost
that would be incurred, which is the same for all
packets

* the optimization objective is to minimize the average
delay cost incurred by all packets

* dependencies are not considered

* their simulations show that the algorithm performs
well compared to the standard packet queuing
algorithms, and Locke's algorithm,

for certain workloads: packets averaging unit
length, in near fully loaded conditions

□ These premises do not correspond well to workload
characteristics of general interest in realtime computation
job scheduling
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Best-Effort  Benefit  Accrual  Scheduling 
Exacts  A  Higher  Price  Than  Simpler  Approaches 

□ Benefit functions and best-effort realtime scheduling
algorithms

* utilize more application-supplied information
than is usual

* place specific requirements on the kind of scheduling
mechanisms that must be provided (i.e., in the
OS kernel)

* and thus exact a higher computational price than
when little or no such information is used

□ In many (if not most) cases, high cost/performance
can be attained by good engineering

□ Much of the price can be paid with inexpensive hardware

*  higher  performance  processors 

* a dynamically assigned processor in a multiprocessor
node

* a special-purpose hardware accelerator (analogous
to a floating-point co-processor) in a uniprocessor
or multiprocessor node


Best-Effort Benefit Accrual Scheduling Will Be Supported
In  A  New  Version  Of  The  OSF  Mach  3  Standard 

□ Version S.o of the Mach 3 microkernel standard from
OSF has no realtime capabilities

□ To create realtime functionality for subsequent versions
of the OSF Mach 3 microkernel standard,

a team of organizations is collaborating, primarily

* Digital Equipment Corp.'s Libra program

* OSF's Research Institute

* WPI's Center for High Performance Computing

* SRI International

with additional funding from

* DARPA

* USAF Rome Labs

* Digital early-adopter customers

□ This new realtime functionality will include

* kernel mechanisms to implement virtually any
scheduling policy specified by the client,

specifically including best-effort benefit accrual
ones

* a scheduling policy interface to the kernel mechanisms
that facilitates the creation, maintenance,
and replacement of policies
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Realtime  Trans-Node  (Alpha  Kernel)  Threads 
Are  Also  Being  Incorporated  Into  The  Mach  3  Standard 

□ The same team is also incorporating the Alpha kernel's
realtime trans-node threads into forthcoming
versions of OSF's Mach 3 standard;

this will greatly improve the ability to construct asynchronous
decentralized systems (among other things)
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As  an  interactive  design  tool 

—  The  user  provides 

o  task  system  description  (an  annotated  data-flow 
graph  of  the  task  system), 
o  system  resource  description,  and 
o  information  on  additional  external  constraints. 

—  Outputs  produced  by  PERTS  include 

o  processor  and  resource  requirements, 
o  sample  task  partitions  and  allocations, 
o  sample  schedules  and  memory  layouts, 
o  performance  predictions,  and 
o  suggested  design  changes  and  tests. 

As  a  development  and  evaluation  tool 

—  The  user  provides 

o  annotated  source  code  or  object  interface 
definitions,  and 
o  system  description. 

—  PERTS  can  provide 

o  a  simulated  (or  emulated)  target  environment,  and 
o  performance  profile. 
[Figure: diagram with blocks labeled “scheduling and resource-access control” and “processors”]


Task  System  Description 


—  Ready  Time  (0) 

—  Deadline  (infinity) 

—  Period  (infinity) 

—  Phase  (0) 

—  List  of  resource  requirements 
(types,  units  and  required  intervals) 

—  List  of  optional  intervals  (null) 

—  In  type  (AND) 

—  Out  type  (AND) 

—  Laxity  type  (better-late-than-never) 
aiUtI hates  of  (y  tosii:  a  onii  of  tocrft 
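
The task attributes listed above, with their defaults in parentheses, map naturally onto a record type. This is a sketch only: the class and field names are invented for illustration and are not the PERTS input format.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[float, float]


@dataclass
class ResourceRequirement:
    resource_type: str
    units: int
    required_interval: Interval


@dataclass
class TaskDescription:
    """One task, using the defaults shown on the slide."""
    name: str
    ready_time: float = 0.0
    deadline: float = math.inf
    period: float = math.inf
    phase: float = 0.0
    resource_requirements: List[ResourceRequirement] = field(default_factory=list)
    optional_intervals: List[Interval] = field(default_factory=list)  # null by default
    in_type: str = "AND"
    out_type: str = "AND"
    laxity_type: str = "better-late-than-never"
```

A task with only a name is therefore aperiodic, ready immediately, and has no deadline, which matches the listed defaults.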


Other  Input  Information 


System  description  —  a  list  of  resources,  each  defined 

by  parameters  including 

—  acquisition  time  (time  required  to  acquire  an  idle 
resource) 

—  de-acquisition  time  (time  required  to  release  a 
resource) 

—  latency  time 

—  context  switch  time  (time  to  switch  the  resource 
from  one  task  to  another  if  preempted) 

—  preemptability  (whether  the  resource  must  be  used 
serially) 

—  maximum  number  of  owners 

—  number  of  current  owners 

External  Constraints  —  Arbitrary  constraints  that 

cannot  be  deduced  from  task  and  resource  descriptions. 

Examples  are  maximum  allowable  processor  utilization, 

intentional  idle  resources,  nonpreemptive  tasks,  etc. 


Schedulability  Analysis  System 
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Task  System  Description 
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End-to-End  Scheduler  ✓ 
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Sample  Schedule 


Scheduling  Directives 
to  Testbed 
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•  Variations  in  processing  time  and  resources  required 
by  individual  tasks  due  to 

—  data-dependent  execution 

—  effects  of  performance  enhancing  features 

—  resolution  and  error  in  processor  time  and  resource 
usage  measurements,  etc. 

•  Variations  in  dispatching  and  execution  orders  when 

— tasks contend for resources

—  there  are  data  and  control  dependencies 

—  ready  times  of  tasks  are  arbitrary 

•  Variations  in  the  number  of  tasks 


ARE  UNAVOIDABLE 
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A  priority-driven  or  list  scheduling  algorithm 

—  assigns  priorities  to  tasks, 

—  makes  scheduling  decisions  and,  possibly,  alters 
task  priorities 

o  when  any  task  becomes  ready  and 
o  when  any  task  completes,  and 

—  at  each  scheduling  decision  time,  executes  the 
task  with  the  highest  priority  among  all  ready 
tasks. 

All  algorithms  that  never  leave  the  processor(s)  idle 
intentionally  are  priority-driven  algorithms. 

Examples are rate-monotonic, earliest-deadline-first,
shortest-processing-time-first and first-in-first-out.
[Figure: schedule on a time axis from 0 to 26, with marks labeled L(R) and U(R)]


T1 = (8, 5), T2 = (22, 7), T3 = (26, 4.5)


We  have  methods  to  predict  such  behavior 
Schedulability of periodic tasks

fixed priorities

on one processor


An  Example  Illustrating  the  Unacceptable  Performance  of 
the  Rate-monotone  Algorithm  for  Multiprocessor  Scheduling 


Schedule the n+1 jobs (1, 2ε), (1, 2ε), ..., (1, 2ε), (1+ε, 1)
on n processors using the rate-monotone algorithm


[Figure: timeline marked 0, 2ε, 1, 1+ε, with the missed deadline indicated]


U = n (2ε / 1) + 1 / (1 + ε) ≈ 1


Solution:  statically  bind  jobs  to  processors 
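
The arithmetic behind this example can be checked with a short sketch. It assumes each job is written as (period, execution time), deadlines equal periods, all jobs are released at time 0, and shorter periods get higher priority; the function names are illustrative.

```python
def total_utilization(n: int, eps: float) -> float:
    """Utilization of n jobs (1, 2*eps) plus one job (1 + eps, 1)."""
    return n * (2.0 * eps / 1.0) + 1.0 / (1.0 + eps)


def long_job_earliest_completion(eps: float) -> float:
    """The n short jobs occupy all n processors during [0, 2*eps], so the
    long job cannot start before 2*eps and needs one full time unit."""
    return 2.0 * eps + 1.0


def long_job_misses_deadline(eps: float) -> bool:
    """The long job's deadline is the end of its period, 1 + eps."""
    return long_job_earliest_completion(eps) > 1.0 + eps
```

For any positive eps the long job finishes no earlier than 1 + 2·eps, past its deadline of 1 + eps, even though total utilization stays close to 1 on n processors. Binding the long job to its own processor removes the miss.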




6 independent tasks on 3 processors with
priority list = (T1, T2, T3, T4, T5, T6)


Processing  times:  4,  6,  5,  5,  2,  3 
Ready  Times:  0,  0,  0,  4,  3,  5 


Processing  times:  3.5,  6,  5,  5,  2,  3 
Ready  Times:  0,  0,  0,  4,  3,  5 
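
The two parameter sets above can be run through a small priority-driven simulator. This is a sketch, assuming non-preemptive execution (the slide does not say whether tasks are preemptable) and that position in the priority list decides between ready tasks; the function name is hypothetical.

```python
from typing import List, Sequence


def list_schedule(
    proc_times: Sequence[float], ready_times: Sequence[float], num_procs: int
) -> List[float]:
    """Non-preemptive priority-driven scheduling; returns each task's completion time.

    Tasks are indexed in priority order (index 0 is highest). A processor is
    never left idle while some task is ready.
    """
    n = len(proc_times)
    finish = [0.0] * n
    free_at = [0.0] * num_procs          # time at which each processor becomes free
    pending = set(range(n))
    t = 0.0
    while pending:
        ready = [i for i in sorted(pending) if ready_times[i] <= t]
        idle = [p for p in range(num_procs) if free_at[p] <= t]
        if ready and idle:
            task, proc = ready[0], idle[0]
            finish[task] = t + proc_times[task]
            free_at[proc] = finish[task]
            pending.remove(task)
            continue
        # Nothing can start now: jump to the next arrival or completion.
        events = [ready_times[i] for i in pending if ready_times[i] > t]
        events += [f for f in free_at if f > t]
        t = min(events)
    return finish


original = list_schedule([4, 6, 5, 5, 2, 3], [0, 0, 0, 4, 3, 5], 3)
shortened = list_schedule([3.5, 6, 5, 5, 2, 3], [0, 0, 0, 4, 3, 5], 3)
```

Under these assumptions the last task finishes at time 9 in the first case and at time 10 in the second: shortening T1 lets T5 grab the freed processor at 3.5, which delays the higher-priority T4.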


Unexpected  Behavior  of  Priority-Driven  Algorithms 


(T1, T2, T3, T4, T5, T6, T7, T8, T9)


• Suppose that we have four processors instead.
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Suppose  that  tasks  have  shorter  processing  times 


[Figure: tasks annotated with shortened processing times]


(T1, T2, T3, T4, T5, T6, T7, T8, T9)
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Scheduling  to  Meet  Timing  Constraints 

Remaining  problems  in  the  framework  of 

—  the  periodic-job  model  and 

—  the  complex-job  model 

Problems  yet  to  be  solved 

—  Scheduling  to  meet  end-to-end  deadlines 

o  examples  from  tightly-coupled  and  loosely-coupled 
systems 

o  variations  of  the  end-to-end  scheduling  problem 
o  related  problems,  existing  solutions,  and  future  work 

—  Dynamic  scheduling  (and  monitor-based  scheduling) 

o  costs  and  benefits  of  dynamic  strategies 
o  examples  of  unstable  and  oscillatory  behavior 
o  needed  solutions,  theories  and  supporting  data 

—  Scheduling  to  enhance  dependability 

o  scheduling  replicated  tasks  to  mask  errors 
o  scheduling  imprecise  tasks  to  increase  availability 

—  Scheduling  to  meet  deadlines  with  high  probability 

o  model  validation  and  calibration 
o  performance  profiling  techniques,  tools  and 
experiment  designs 


Suppose  that  tasks  are  less 


[Figure: tasks annotated with processing times]
End-to-End Scheduling


Given: 


—  a  system  of  physically  or  functionally  distinct  types  of 
processors 

—  jobs  containing  tasks  to  be  executed  in  turn  on  different 
types  of  processors  and  having  end-to-end  deadlines 


Find:  A  schedule  meeting  end-to-end  deadlines  whenever  such 
schedules  exist. 
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Problems  in  End-to-end  Scheduling 


supporting  information 


•  There  are  solutions  emerging  for  the  cases  where 

—  global  information  is  available  and  current; 

—  global  information  is  available  but  may  be  old,  and 
performance  optimization  is  not  important. 

•  Solutions  are  needed  in  all  other  cases. 
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End-to-End  Scheduling  in  a  Tightly-Coupled  System 


[Figure: two sensor inputs feeding a chain of GM data transfer, signal processing, and data processing stages, ending at a deadline]


Typical  assumptions: 

—  Global  information  on  load  condition  and  processor  status 
is  complete  and  current. 

—  Rescheduling  is  necessary  only  when  mode  changes. 
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End-to-End  Scheduling  in  Loosely-Coupled  Environments 


•  Examples: 

—  Scheduling  jobs  in  remote  controllers,  command  and 
control  systems,  process  control  systems,  etc. 

—  Routing  and  sequencing  real-time  communications 

•  Typical  assumptions 

—  Global  information  is  not  available,  incomplete,  or 
complete  but  not  current. 

—  Scheduling  is  done 

o  at  configuration  time  and  major  outage 
o  during  mode  change 
o  during  session  (or  connection)  establishment 
o  on  a  per  job,  per  task,  or  per  message  basis 
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End-to-End  Scheduling  without  Global  Information 

•  The  only  known  approach:  first  distribute  the  overall  slack 
time  of  each  job  to  the  individual  tasks  in  it. 

•  Components  of  the  needed  solutions  include 

—  on-line  and  nearly-on-line  scheduling  algorithms 
(How  much  can  partial  and  old  information  help?) 

—  algorithms  for  scheduling  tasks  to  minimize  error  or 
to  minimize  the  number  of  discarded  optional  tasks 

—  dynamic  and  monitor-based  algorithms 
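The slack-distribution approach named in the first bullet can be sketched as follows. This is a minimal illustration, assuming the slack is divided in proportion to each task's execution time. The slide does not say which distribution rule is intended, and the function name and parameters are hypothetical.

```python
def distribute_slack(release, deadline, exec_times):
    """Split a job's end-to-end slack among its tasks, in proportion to
    execution time (an assumed rule), and return per-task deadlines."""
    total = sum(exec_times)
    slack = (deadline - release) - total
    if slack < 0:
        raise ValueError("end-to-end deadline is shorter than total execution time")
    deadlines = []
    t = release
    for c in exec_times:
        t += c + slack * c / total   # execution time plus this task's share of slack
        deadlines.append(t)
    return deadlines

# A job released at 0 with end-to-end deadline 20 and three tasks in sequence.
print(distribute_slack(0, 20, [2, 3, 5]))   # [4.0, 10.0, 20.0]
```

Once each task has its own deadline, each processor can be scheduled locally without global information, at the price of possibly rejecting jobs that a global scheduler could have fitted.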
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•  What  makes  an  algorithm  dynamic? 


[Figure: feedback loop relating observed demand, run-time scheduling decisions, the schedule, system behavior, and observed behavior]


•  Dynamic  algorithms  are  needed 

— when the demands on the system (and hence the task
parameters) are not known completely or a priori,

—  when  the  system  must  respond  to  frequent  changes  in 
demands  or  configuration  quickly. 

•  Examples  of  dynamic  algorithms  include 

— priority-driven algorithms

— load-balancing algorithms

—  adaptive  algorithms 


Critical issues: cost vs. benefit, stability and convergence
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A  Stability  Issue:  Oscillatory  Behavior 

(An  example  from  Data  Networks  by  Bertsekas  and  Gallager) 


[Figure: plots of the rate on Link 1]


A  Stability  Issue:  Convergence  of  Adaptive  Algorithms 

(An  example  from  Data  Networks  by  Bertsekas  and  Gallager) 


[Figure: network with Link 1 and Link 2, each of capacity C]
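The oscillation and convergence issues can be illustrated with a toy two-link model. This is only a sketch in the spirit of the cited example, not a reproduction of it: the update rule, the step size `alpha`, and the function name are all assumptions.

```python
def simulate_link1_rate(alpha, steps, total=1.0, x1=1.0):
    """Toy adaptive routing over two links. At each step the router sees the
    current split and shifts a fraction alpha of the traffic toward whichever
    link is currently lighter. Returns the history of the rate on Link 1."""
    history = [x1]
    for _ in range(steps):
        x2 = total - x1
        target = 0.0 if x1 > x2 else total   # aim everything at the lighter link
        x1 = x1 + alpha * (target - x1)
        history.append(x1)
    return history

# alpha = 1: all traffic flips between the links on every update.
print(simulate_link1_rate(1.0, 4))        # [1.0, 0.0, 1.0, 0.0, 1.0]

# alpha = 0.1: the rate settles into a narrow band around the even split.
print(simulate_link1_rate(0.1, 30)[-3:])
```

In this model a smaller step size trades responsiveness for stability. The rate never converges exactly, but it stays within 0.05 of the even split once it gets there.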
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Critical  information  about  a  dynamic  algorithm  needed  to 
support  its  safe  usage 

— cost vs. benefit

—  regions  of  stability  and  oscillatory  behavior 

—  ideal  operating  region  and  parameters  tunable  to  keep 
the  operating  point  in  the  region 

—  worst-case  operating  region 


[Figure: operating regions plotted over parameter 1 and parameter 2]
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Validation  and  Verification  of  Real-Time  Systems 

Developments 


March,  1993 


C.  Douglass  Locke 


IBM  Federal  Systems  Company 
Bethesda,  MD,  USA 
locke@vnet.ibm.com 
(301)  49S-1496 
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Outline 


Introduction 

Prerequisites to Verification and Validation

Verification  and  Validation 


Conclusions 



Introduction 


Definitions 

• Validation - Determination that the solution can be
made to work as specified.

•  Verification  -  Determine  that  the  solution  works  as 
specified. 

Prerequisites  to  Verification  and  Validation  of  Real-Time 
Systems 

• Timing & Performance Requirements

•  Analyzable  Architecture 

• Resource Usage Estimates (processing load, network
load, I/O rates)

•  Measurement  Methodology 

•  Analysis  Tools 


Verification:  not  done  only  following  implementation 

• Continuous process.

• Not just something to do at the end of the implementation.

• Not to be confused with an acceptance test.


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Timing  &  Performance  Requirements 


Timing  and  performance  requirements  are  generally 
derived  requirements 

• Actual system requirements leading to timing and
performance requirements are, e.g., accuracy, availability,
human responsiveness.

•  Many  system  requirement  specifications  omit  timing 
and  performance  requirements. 

•  When  expressed,  they  are  almost  always  end-to-end 
requirements. 


Analyzable  Software  and  Systems 

Architecture 


Architecture  is  the  set  of  high  level  design  decisions 
defining: 

•  Processors 

•  Communications 

•  Concurrency 

• High-level data definition (generally as sets).

Time  constraints  must  drive  the  software  and  systems 
architecture. 

•  Operating  Systems 

• Communications protocols

•  Languages 

•  Databases 

•  GUI's 
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Resource Usage Estimates


All  resources  must  be  considered,  including: 

•  Processing  load 

•  Network  load 

•  I/O  rates 

Estimates  must  be  made  for  each  schedulable  entity: 

•  Tasks 

•  Processes 

•  Threads 

Timing  requirements  must  be  decomposed,  resulting  in: 

•  Periodicity 

•  Aperiodic  arrival  rates 

•  Aperiodic  interarrival  times 
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Measurement  Methodology 

Some  means  for  measuring  resource  usage  required: 

•  Operating  System  trace  functions 

•  Communications  monitors 

•  I/O  recording  functions 

•  ICE  hardware 

•  Logic  Analyzers 

•  Application-level  measurements 
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Analysis  Tools 


Dependent  on  scheduling/architectural  models  used, 
e.g.: 

•  RMA  tools 

•  Composite  temporal  merit  computation 


Verification  and  Validation 


Continuous  development  life-cycle  activity: 

•  Architecture  Verification  -  Determination  that  the 
architecture  can  meet  specified  performance  and 
timing  constraints. 

•  Design  Verification  -  Determination  that  the  design 
implements  the  architecture,  and  will  thus  meet  the 
specified  performance  and  timing  constraints. 

•  Code  Inspection  -  Formal  or  informal  determination 
that  the  coded  software  implements  the  design,  and 
will  thus  meet  the  specified  performance  and  timing 
constraints. 

• Model Validation - Measurement of implementation
components that directly address architecture, and
will thus meet the specified performance and timing
constraints.

• Performance Verification - Demonstration that the
finished product meets all externally visible specifications,
including functional, accuracy, availability,
and timing.
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Conclusions 


Real-time  validation  and  verification  fundamentally 
dependent  on: 

•  Early  determination  of  architecture,  derived  timing 
requirements,  and  load 

• Continuous tracking of development against estimates

•  Early  resource  utilization  contingency  management 

In short, resource management for real-time must be
managed in the same way as cost.
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DOMAIN  ANALYSIS 


ESPECIALLY  IMPORTANT  FOR  CREATION  OF  PRODUCT  LINE 
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VNs  ARE  EVENTUALLY  MAPPED  TO  PNs 
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-  “SYSTEM  DESIGN”  PERFORMED  LATER 

-  MUST  CONSIDER  DISTRIBUTED  ISSUES  EARLY 

-  USEFUL  FOR  DEVELOPING  PRODUCT  LINE 


SYSTEM  DESIGN 
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-  CLASS/OBJECT  RELATIONS  (DYNAMIC  MODEL) 

-  WHICH  METHOD  DO  WE  PICK? 

-  WHERE  IS  DISTINCTION  BETWEEN  ANALYSIS  AND  DESIGN? 


SOFTWARE  REQUIREMENTS 
ANALYSIS 
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SOFTWARE  DESIGN 


-  DESIGN  GUIDELINES 

-  STARTS  WITH  INITIAL  PROTOTYPE 

-  CONTINUES  THROUGH  FINAL  BUILD 
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-  DATA  MODELING  (ERDs) 

•  DOMAIN  ANALYSIS 

•  SYSTEM  REQUIREMENTS  ANALYSIS 

•  SOFTWARE  REQUIREMENTS  ANALYSIS 
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OT,  RTSA,  AND  ART  COMPLEMENT  EACH  OTHER 
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EFFICIENT  IPC  MECHANISM 


REAL-TIME  DESIGN  AND 
SCHEDULING  THEORY 
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NO  STANDARD  IPC  MECHANISM  AVAILABLE 


VALIDATING  LARGE,  REAL-TIME, 
DISTRIBUTED  SYSTEMS 
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Question: How to collect data in real-time systems
without disturbing system timing?


Affecting time is more of a problem than affecting space
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Permanent  software  data  collection:  leaving  the 
real-time  data  collection  subsystem  in  the  system. 
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Outline 

•  Need  of  Predictability  and  Scalability 

•  Predicting  System  Real-Time  Capability 

•  Scalable  RT  Scheduling  Strategies 

•  Summary 
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Trend  of  Real-Time  Computing  (cont.) 

•  Massive  parallel  processing 

•  High  speed  networking 

Resource  Management  Overhead 
System  Throughput 


Resource  management  must  be  scalable. 
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•  Requirements 

• Utilization based predictions

•  Successes  and  problems 
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Requirements  of  Performance  Predictor 


Specification 


Performance 

Predictor 


Ok 

Not  OK 
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• Predicting the system safety margin is
the most essential.

==> Prediction must be stable


• Prediction should not depend on
detailed system specification.


==> Utilization based prediction
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Utilization based prediction:

•  Worst  Case  Achievable  Utilization 

If the demand is less than WCAU,
real-time  constraints  are  met. 

This  is  a  measure  of  the  worst  case. 
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Progress  in  WCAU 

• C. L. Liu et al. (1973)

RMS  for  a  single  cpu  system, 

WCAU  =  69%, 

•  CMU  Research  Group  (1980s) 

WCAU  for  various  environments. 

•  TAMU  Research  Group  (1992) 
WCAU(FDDI)  =  33%. 
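The 69% figure for rate monotonic scheduling on a single CPU is the large-n limit of the Liu and Layland bound n(2^(1/n) - 1), which tends to ln 2 (about 0.693). A minimal sketch of the resulting utilization test follows. The function names are hypothetical, and the test is sufficient but not necessary: a task set that fails it may still be schedulable.

```python
import math

def rm_utilization_bound(n):
    """Liu and Layland worst-case achievable utilization for n periodic tasks
    under rate monotonic scheduling on one processor."""
    return n * (2 ** (1 / n) - 1)

def passes_rm_test(tasks):
    """tasks: list of (execution_time, period) pairs, deadline equal to period.
    True means every deadline is guaranteed; False is inconclusive."""
    utilization = sum(c / t for c, t in tasks)
    return utilization <= rm_utilization_bound(len(tasks))

for n in (1, 2, 3, 10, 1000):
    print(n, round(rm_utilization_bound(n), 4))
print("limit:", round(math.log(2), 4))

print(passes_rm_test([(1, 4), (1, 5), (2, 10)]))   # utilization 0.65, within the bound
```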
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Successes  with  WCAU 
DMS/SSFP 
Future  Bus 
SAFENET/FDDI 
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Problems  with  WCAU 


•  Existing  work  is  limited  to 
single  system  components. 

•  Many  interesting  architectures 
might  not  have  meaningful  WCAU. 

•  Does  not  provide  any  information 
if  the  demand  >  WCAU. 
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Guarantee  Probability 

•  Definition 

GP(U)  = 

Prob(A  set  of  requests 

are  guaranteed  |  demand  =  U) 

•  A  measure  of  both  average  and  worst 
cases. 

•  A  generalized  notion  of  WCAU: 

WCAU  =  max(  U  |  GP(U)  =  1). 
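The definition above can be made concrete with a Monte Carlo estimate. The sketch below assumes a setting the slide does not specify: rate monotonic scheduling on one processor, random task sets with a fixed total utilization U, and an exact response-time test as the guarantee check. All function names, the period range, and the sampling scheme are assumptions for illustration.

```python
import math
import random

def rm_guaranteed(tasks):
    """Exact response-time test under rate monotonic priorities.
    tasks: list of (execution_time, period); deadline equals period."""
    tasks = sorted(tasks, key=lambda ct: ct[1])   # shorter period = higher priority
    for i, (c, t) in enumerate(tasks):
        r = c
        while True:
            nxt = c + sum(math.ceil(r / tj) * cj for cj, tj in tasks[:i])
            if nxt > t:
                return False      # worst-case response time exceeds the deadline
            if nxt == r:
                break             # fixed point reached: deadline is met
            r = nxt
    return True

def random_task_set(n, u_total, rng):
    """Split u_total into n utilizations (UUniFast-style) with arbitrary periods."""
    utils, remaining = [], u_total
    for i in range(1, n):
        nxt = remaining * rng.random() ** (1.0 / (n - i))
        utils.append(remaining - nxt)
        remaining = nxt
    utils.append(remaining)
    periods = [rng.uniform(10.0, 100.0) for _ in range(n)]
    return [(u * t, t) for u, t in zip(utils, periods)]

def estimate_gp(u_total, n=3, trials=2000, seed=0):
    """Estimate GP(U): the fraction of random task sets with demand U
    whose requests are all guaranteed."""
    rng = random.Random(seed)
    ok = sum(rm_guaranteed(random_task_set(n, u_total, rng)) for _ in range(trials))
    return ok / trials

for u in (0.6, 0.8, 0.9, 1.0):
    print(u, estimate_gp(u))
```

In this setting GP(U) should equal 1 for any U at or below the worst-case achievable utilization and then fall off as U approaches 1, which is the extra information that a single WCAU number does not give.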
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Guarantee  Probability  for  FDDI 


[Figure: guarantee probability plot]
Number of stations N = 100.
Minimum period = 10 msec.
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Utilization based prediction helps to

•  make  design  decisions 

•  establish  management  confidence 

•  reduce  testing  and  integration  cost 
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•  Is  EDF  scalable? 

•  R-Shell  — 

a  new  RT  scheduling  strategy 
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Is  Earliest-Deadline-First  Scalable? 


•  Scheduling  algorithms  for  centralized 
systems  were  well  studied. 


•  Many  optimal  algorithms  were  developed. 

•  Optimality  is  proved  by  assuming  zero 
scheduling  overhead. 


•  In  a  distributed  environment,  scheduling 
overhead  impacts  the  performance. 
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An  Example:  scheduling  real-time 
messages  in  a  token  ring  network 

Three protocols to be studied:

•  Simple  token  passing  protocol 
—  Non  EDF 

•  Priority  driven  protocol 

—  Approximate  EDF 

•  Window  protocol 

—  Exact  EDF 

IDA 1993
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In  a  parallel/distributed  system, 


Scheduling Policy

Performance = -

Scheduling Overhead


R-Shell

A  New  RT  Scheduling  Methodology 


Objectives  of  R-Shell  System 


For parallel/distributed
real-time applications


Use  of  a  scalable  scheduling  method 
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Scheduling  in  R-Shell 


•  Semi-dynamic 


Partial  schedules  are  generated  at 
compilation  time 


•  Application  autonomic 

Each  application  has  a  scheduling 
agent. 


There  is  no  system  scheduler! 
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Summary


• Real-time scheduling in parallel/distributed
systems is challenging


Predictability  and  scalability  are  two 
key  issues 

Further  systematic  exploration  should 
result  in  cost-effective  design  methodology 
for  new  generation  RT  computing  systems 




REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB No. 0704-0188


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources,
gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this
collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson
Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.


1.  AGENCY  USE  ONLY  (Leave  blank) 


2.  REPORT  DATE 

July  1993 


3.  REPORT  TYPE  AND  DATES  COVERED 

Final 


4. TITLE AND SUBTITLE

Proceedings of the Workshop on Large, Distributed, Parallel Architecture,
Real-Time Systems


6. AUTHOR(S)

Norman R. Howes, Dennis W. Fife, Jonathan D. Wood


5. FUNDING NUMBERS

MDA 903 89 C 0003
Task  T-R2-597.2 


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Institute  for  Defense  Analyses  (IDA) 

1801  N.  Beauregard  St. 

Alexandria,  VA  22311-1772 


8. PERFORMING ORGANIZATION REPORT
NUMBER

IDA  Document  D-1425 


9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)

Ballistic  Missile  Defense  Office 
The  Pentagon,  Room  1E149 
Washington, DC 20301-7100


10. SPONSORING/MONITORING AGENCY
REPORT NUMBER


12a. DISTRIBUTION/AVAILABILITY STATEMENT


Approved  for  public  release;  unlimited  distribution;  8  June  1994. 


12b.  DISTRIBUTION  CODE 

2A 


13.  ABSTRACT  (Maximum  200  words) 

The  workshop  on  Large,  Distributed,  Parallel  Architecture,  Real-Time  Systems  was  sponsored  by  the  Ballistic 
Missile  Defense  Office  (BMDO)  and  the  NASA  Ames  Research  Center  and  hosted  at  IDA  in  March  1993.  The 
purpose of the workshop was to obtain expert opinions on the following questions: (1) What is the best design
methodology for this class of systems? (2) What is the proper relationship between design theory and
scheduling theory? (3) What is the best method for validating this class of systems? and (4) What are the most
promising areas where resources might be applied for near-term benefits? Twenty-three experts from
academia,  government  and  industry  were  invited  to  attend  of  which  seventeen  accepted.  In  total,  there  were 
twenty-three  participants  including  sponsors  and  IDA  research  staff.  The  invitees  contributed  position  papers 
in  advance  of  the  workshop  and  presented  talks  from  transparencies.  These  position  papers  and  transparencies 
comprise  the  contents  of  the  proceedings.  The  informal  discussions  that  took  place  at  the  workshop  are 
summarized  in  the  introductory  material  by  the  IDA  research  staff.  The  advice,  opinions  and  methods  of  the 
participating  experts  are  intended  to  help  BMDO  and  NASA  in  the  development  of  their  respective  software 
technology planning.


14. SUBJECT TERMS

Real-Time;  Parallel  Real-Time;  Distributed  Real-Time;  Real-Time  Scheduling; 
Real-Time  Design  Methods. 


15.  NUMBER  OF  PAGES 

506 


16.  PRICE  CODE 


17. SECURITY CLASSIFICATION OF REPORT: Unclassified
18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified
19. SECURITY CLASSIFICATION OF ABSTRACT: Unclassified
20. LIMITATION OF ABSTRACT


NSN  7540-01-280-5500 


Standard Form 298 (Rev. 2-89)
Prescribed by ANSI Std. Z39-18
298-102 


